Quick Tutorial
Consider the sleepstudy
dataset (Belenky et al., 2003), which is from a sleep deprivation study that measured the average reaction time from 18 subjects over 10 days.
using MixedModelsBLB, CSV, DataFrames, StatsModels, Random, Ipopt
sleepstudy = CSV.read("../../../test/data/sleepstudy.csv", DataFrame);
first(sleepstudy, 5)
<div class="data-frame"><p>5 rows × 3 columns</p><table class="data-frame"><thead><tr><th></th><th>Reaction</th><th>Days</th><th>id</th></tr><tr><th></th><th title="Float64">Float64</th><th title="Int64">Int64</th><th title="Int64">Int64</th></tr></thead><tbody><tr><th>1</th><td>249.56</td><td>0</td><td>308</td></tr><tr><th>2</th><td>258.705</td><td>1</td><td>308</td></tr><tr><th>3</th><td>250.801</td><td>2</td><td>308</td></tr><tr><th>4</th><td>321.44</td><td>3</td><td>308</td></tr><tr><th>5</th><td>356.852</td><td>4</td><td>308</td></tr></tbody></table></div>
To fit a Linear Mixed Model (LMM) of the form
\[$$\text{Reaction} ~ \text{Days} + (1|\text{ID})},$$\]
and perform statistical inference using the Bag of Little Bootstraps (BLB), we use the following code:
blb_ests = blb_full_data(
MersenneTwister(1),
sleepstudy;
feformula = @formula(Reaction ~ 1 + Days),
reformula = @formula(Reaction ~ 1),
id_name = "id",
cat_names = Array{String,1}(),
subset_size = 10,
n_subsets = 20,
n_boots = 500,
method = :ML,
solver = Ipopt.IpoptSolver(print_level=0),
verbose = false,
nonparametric_boot = true
);
******************************************************************************
This program contains Ipopt, a library for large-scale nonlinear optimization.
Ipopt is released as open source code under the Eclipse Public License (EPL).
For more information visit https://github.com/coin-or/Ipopt
******************************************************************************
In this chunk,
MersenneTwister(1)
sets the random seed for subsetting and resampling.feformula
andreformula
specify the fixed and random effect formula, respectively.id_name
is the name of the grouping factor such as subject ID.cat_names
is a vector of categorical variable names. If there aren't any, simply set it as we did above.subset_size
,n_subsets
, andn_boots
are BLB parameters. Typically, we recommend settingsubset_size
= $N^{0.6}$ or $N^{0.7}$, where $N$ is the total number of subjects.n_subsets
= 10-20.n_boots
= 500-2000
- Ipopt is a freely-available gradient-based solver and works quite well. Knitro is 3-5 times faster than Ipopt but requires a liscense (you might be eligible for an academic liscense).
To see the result, type
print(blb_ests)
Bag of Little Boostrap (BLB) for linear mixed models.
Method: ML
Number of subsets: 20
Number of grouping factors per subset: 10
Number of bootstrap samples per subset: 500
Confidence interval level: 95%
Variance Components parameters
─────────────────────────────────────────
Estimate CI Lower CI Upper
─────────────────────────────────────────
(Intercept) 1289.92 463.5 2187.22
Residual 855.397 541.333 1227.79
─────────────────────────────────────────
Fixed-effect parameters
──────────────────────────────────────────
Estimate CI Lower CI Upper
──────────────────────────────────────────
(Intercept) 248.978 235.696 262.032
Days 10.4608 7.73577 13.2131
──────────────────────────────────────────
Results are displayed in two tables, showing the BLB estimates and confidence intervals for both fixed effect and variance components parameters.
Reference
Gregory Belenky, Nancy J. Wesensten, David R. Thorne, Maria L. Thomas, Helen C. Sing, Daniel P. Redmond, Michael B. Russo and Thomas J. Balkin (2003) Patterns of performance degradation and restoration during sleep restriction and subsequent recovery: a sleep dose-response study. Journal of Sleep Research 12, 1–12.