zubin <binabina <at> bellsouth.net> writes:

> Hello, running a mixed model in the package LME4, lmer()
>
> Panel data, have about 322 time periods and 50 states, total data set is
> approx 15K records and about 20 explanatory variables. Not a very
> large data set.
>
> We run random intercepts as well as random coefficients for about 10 of
> the variables, the rest come in as fixed effects. We are running into
> a wall of time to execute these models.
>
> A sample specification of all random effects:
>
> lmer(Y ~ 1 + (x_078 + x_079 + growth_st_index +
>      retail_st_index + Natl + econ_home_st_index +
>      econ_bankruptcy + index2_HO + GPND_ST | state),
>      data = newData, doFit = TRUE)
>
> Computation time is near 15 minutes.
>
>   System   ELAPSED   User
>   21.4     888.63    701.74
>
> Does anyone have any ideas on way's to speed up lmer(), as well any
> parallel implementations, or approaches/options to reduce computation time?
(1) These kinds of questions will probably get more informed answers on the r-sig-mixed-models list. Please direct follow-ups there.

(2) I'm not really sure whether this counts as "large" in the mixed/multilevel model world. It's certainly not very large for a standard linear regression. For comparison, the 'Chem97' dataset in the mlmRev package is 31022 observations x 8 variables x 2280 blocks and is described as "relatively large". So the raw data matrix is about the same size (twice as long, half as wide), but there are many more blocks.

(3) Fitting 10 random effects (including the intercept) is very ambitious: it leads to the estimation of a 10x10 correlation matrix ... I don't know whether you know that's what you're doing, or whether you need the full correlation matrix. You can split it up into independent blocks (in the extreme, 10 uncorrelated random effects) by specifying the REs as separate chunks, e.g.

  (1|state) + (0+x_078|state) + (0+x_079|state) + ...

(see some of the examples in the lmer documentation). lme, in the nlme package, offers more flexibility in specifying structured correlation matrices of different types, but will in general be slower than lme4. Even so, it might be faster to fit a structured (simpler) model you're happy with using lme than to fit the full unstructured model using lmer.

(4) The development version of lme4, lme4a, *might* be faster (but is less well tested/less stable).

(5) Do you have alternatives? I haven't worked with data sets this size myself, but anecdotes on the r-sig-mixed-models list suggest that lmer is faster than most alternatives ... ?

Ben Bolker
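
To make point (3) concrete, here is a minimal sketch in base R. It counts the variance-covariance parameters implied by the full specification versus the fully uncorrelated one, and builds the uncorrelated formula from the variable names in the original post. The object names (re_vars, n_full, n_diag, f_diag, fit_diag) are made up for illustration, and the lmer() call is left commented out because it needs lme4 and the poster's newData, neither of which is available here.

```r
# Random-slope variables taken from the model in the original post.
re_vars <- c("x_078", "x_079", "growth_st_index", "retail_st_index", "Natl",
             "econ_home_st_index", "econ_bankruptcy", "index2_HO", "GPND_ST")

# Number of random effects per state, counting the intercept.
q <- length(re_vars) + 1

# Variance-covariance parameters (residual variance not counted):
n_full <- q * (q + 1) / 2   # q variances plus q*(q-1)/2 covariances
n_diag <- q                 # one variance per term, no covariances

# Uncorrelated specification: one separate term per random effect.
re_terms <- c("(1 | state)", paste0("(0 + ", re_vars, " | state)"))
f_diag <- as.formula(paste("Y ~ 1 +", paste(re_terms, collapse = " + ")))

print(c(full = n_full, uncorrelated = n_diag))
print(f_diag)

# Assumption: lme4 is installed and newData is the poster's data frame.
# library(lme4)
# fit_diag <- lmer(f_diag, data = newData)
```

Whether the uncorrelated model is adequate is a modelling question, not just a speed question; intermediate block structures (some terms grouped together, others separate) can be written the same way by combining variables inside one pair of parentheses.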