Re: [R] mixed-effects models with (g)lmer in R and model selection

2016-02-19 Thread Bert Gunter
Absolutely!  Even more, consult a local expert in applying mixed effects
models. The OP's strategy sounded to me like a prescription for producing
irreproducible results (due to overfitting).

Cheers,
Bert



On Friday, February 19, 2016, Don McKenzie  wrote:

> This is a complicated and subtle statistical issue, not an R question, the
> latter being the purpose of this list.  There are people on the list who
> could give you literate answers, to be sure, but a statistically oriented
> list would be a better match.
>
> e.g.,
>
> http://stats.stackexchange.com/


-- 
Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] mixed-effects models with (g)lmer in R and model selection

2016-02-19 Thread Jianling Fan
Hello, Wilbert,

You gave a good procedure for lme model selection, thanks! I learned
something from it. I have also been working on a similar problem recently.
Maybe you can take a look at the "glmmLasso" package, which allows model
selection in generalized linear mixed effects models using the LASSO
shrinkage method.


Regards,

Jianling




Re: [R] mixed-effects models with (g)lmer in R and model selection

2016-02-19 Thread Don McKenzie
This is a complicated and subtle statistical issue, not an R question, the
latter being the purpose of this list.  There are people on the list who could
give you literate answers, to be sure, but a statistically oriented list would
be a better match, e.g.,

http://stats.stackexchange.com/







[R] mixed-effects models with (g)lmer in R and model selection

2016-02-19 Thread Wilbert Heeringa
Dear all,

Mixed-effects models are wonderful for analyzing data, but it is always a
hassle to find the best model, i.e. the model with the lowest AIC,
especially when the number of predictor variables is large.

Presently when trying to find the right model, I perform the following
steps:

   1. Start with a model containing all predictors. Assuming dependent
      variable X and predictors A, B, C, D, E, I start with: X~A+B+C+D+E

   2. lmer warns that it has dropped columns/coefficients. These are
      variables that are *perfectly* correlated with one of the other
      variables or with a combination of variables. summary() shows which
      columns have been dropped. Assuming predictor D has been dropped, I
      continue with this model: X~A+B+C+E

   3. Subsequently I need to check whether there are variables (or groups of
      variables) which *strongly* correlate with each other. I included the
      function vif.mer (developed by Austin F. Frank and available at:
      https://raw.github.com/aufrank/R-hacks/master/mer-utils.R) in my script,
      and when applying this function to my reduced model, I got vif values
      for each of the variables. When vif>5 for a predictor, it probably
      should be removed. In case multiple variables have a vif>5, I first
      remove the predictor with the highest vif, then re-run lmer and vif.mer.
      I again remove the predictor with the highest vif (if one or more
      predictors still have a vif>5), and I repeat this until none of the
      remaining predictors has a vif>5. If the larger model(s) gave a
      "Model failed to converge" warning, that warning no longer appears for
      the 'cleaned' model.

   4. Assume the following predictors have survived: A, B and E. Now I want
      to find the combination of predictors that gives the smallest AIC. For
      three predictors it is easy to try all combinations, but with 10
      predictors, manually trying all combinations would be time-consuming.
      So I used the function fitLMER.fnc from the LMERConvenienceFunctions
      package. This function back-fits fixed effects, forward-fits random
      effects, and then back-fits the fixed effects again. I consider the
      model given by fitLMER.fnc as the right one.
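
Steps 2 and 3 above can be sketched roughly as follows, using base R only.
This is a minimal illustration under stated assumptions, not the procedure
as actually run: the data frame `dat` is simulated, `vif_design` is a
hypothetical helper, and the VIF used here is the classical one computed
from the fixed-effect design matrix, which is not the same computation as
vif.mer (that function works on a fitted model). The cutoff of 5 is the one
mentioned in step 3.

```r
# Hypothetical simulated predictors, for illustration only.
set.seed(1)
n <- 200
dat <- data.frame(A = rnorm(n), B = rnorm(n), C = rnorm(n))
dat$D <- dat$A + dat$B                  # exact linear combination of A and B
dat$E <- dat$C + rnorm(n, sd = 0.1)     # strongly (not perfectly) correlated with C

# Step 2: columns that make the fixed-effect design matrix rank deficient.
# qr() moves columns it judges linearly dependent to the end of the pivot.
mm <- model.matrix(~ A + B + C + D + E, data = dat)
q <- qr(mm)
aliased <- colnames(mm)[q$pivot[seq_len(ncol(mm)) > q$rank]]

# Step 3: classical VIFs are the diagonal of the inverse correlation matrix
# of the predictors (hypothetical helper, not vif.mer).
vif_design <- function(X) {
  setNames(diag(solve(cor(X))), colnames(X))
}

# Repeatedly drop the predictor with the largest VIF until all VIFs are <= 5.
keep <- setdiff(colnames(mm), c("(Intercept)", aliased))
repeat {
  v <- vif_design(mm[, keep, drop = FALSE])
  if (max(v) <= 5) break
  keep <- setdiff(keep, names(which.max(v)))
}

print(aliased)  # columns dropped for exact collinearity
print(v)        # VIFs of the predictors that remain
```

In this simulation D is flagged as exactly collinear, and one of C and E is
then removed by the VIF loop. Which of the two goes is decided by a tiny
difference in VIF, which is one reason a purely mechanical rule of this kind
deserves some caution.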

I am not an expert in mixed-effects models and have struggled with model
selection. The procedure I described works for me, but I would really
appreciate hearing whether it is sound, or whether there are better
alternatives.

Best,

Wilbert

