Re: [R] glmnet inclusion / exclusion of categorical variables

David Winsemius Fri, 09 Aug 2013 12:14:24 -0700

On Aug 9, 2013, at 6:44 AM, Kevin Shaney wrote:

> 
> Hello -
> 
> I have been using GLMNET of the following form to predict multinomial 
> logistic / class dependent variables:
> 
> mglmnet=glmnet(xxb,yb ,alpha=ty,dfmax=dfm,
> family="multinomial",standardize=FALSE)
> 
> I am using both continuous and categorical variables as predictors, and am 
> using sparse.model.matrix to code my x's into a matrix.  This is changing an 
> example categorical variable whose original name / values is {V1 = "1" or "2" 
> or "3"} into two recoded variables {V12= "1" or "0" and V13 = "1" or "0"}.


You could set their penalty factors (the penalty.factor argument) to 0 to at
least observe the case where both are included. And setting the penalty factor
for both to be small would allow you to "honestly" use 0 as the estimated
coefficient in cases where one was estimated and the other was not.
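
A minimal sketch of the penalty-factor idea, on made-up data rather than
your xxb and yb. The data frame, the seed, and the names dat, pf and fit are
assumptions for illustration only; alpha and dfmax are left at their defaults.
A penalty factor of 0 means those columns are not shrunk, so they stay in the
model along the whole lambda path:

```r
library(Matrix)   # sparse.model.matrix()
library(glmnet)

set.seed(1)
n <- 300
# Hypothetical data: one 3-level factor and two continuous predictors
dat <- data.frame(V1 = factor(sample(1:3, n, replace = TRUE)),
                  x1 = rnorm(n),
                  x2 = rnorm(n))
yb <- factor(sample(c("a", "b", "c"), n, replace = TRUE))

# Dummy-code the factor and drop the intercept column;
# the remaining columns are V12, V13, x1, x2
xxb <- sparse.model.matrix(~ V1 + x1 + x2, data = dat)[, -1]

# Penalty factor 0 for the two recoded V1 columns, 1 for everything else
pf <- ifelse(colnames(xxb) %in% c("V12", "V13"), 0, 1)

fit <- glmnet(xxb, yb, family = "multinomial",
              standardize = FALSE, penalty.factor = pf)
```

Replacing the 0 with a small positive value (say 0.01) gives the "small
penalty" variant, in which the two columns are shrunk only lightly.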

> 
> As i am cycling through different penalties, i would like to either have both 
> recoded variables included or both excluded, but not one included - and
> can't figure out how to make that work.   I tried changing the
> "type.multinomial" option, as that looks like this option should do what i 
> want, but can't get it to work (maybe the difference in recoded variable 
> names is driving this).

Doesn't the 'family' argument, which I think is what you are calling 'type',
just refer to the y argument rather than the predictors? You may want:

   mglmnet=glmnet(xxb,yb ,alpha=ty,dfmax=dfm, type.multinomial="grouped",
                 family="multinomial",standardize=FALSE)
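
One caveat, as far as I read the glmnet documentation: the "grouped" option
puts a grouped-lasso penalty on the coefficients of a single column of x across
the response classes, so each column is in or out for all classes together. It
does not tie V12 and V13 to each other. The sketch below checks both points on
made-up data; the data and every object name in it are assumptions, not your
xxb and yb:

```r
library(Matrix)
library(glmnet)

set.seed(1)
n <- 300
# Hypothetical data: one 3-level factor and two continuous predictors
dat <- data.frame(V1 = factor(sample(1:3, n, replace = TRUE)),
                  x1 = rnorm(n),
                  x2 = rnorm(n))
# Three-class response that depends loosely on x1 and x2
yb <- factor(ifelse(dat$x1 + rnorm(n) > 0.5, "a",
             ifelse(dat$x2 + rnorm(n) > 0, "b", "c")))

xxb <- sparse.model.matrix(~ V1 + x1 + x2, data = dat)[, -1]

fit_g <- glmnet(xxb, yb, family = "multinomial",
                type.multinomial = "grouped", standardize = FALSE)

# fit_g$beta holds one (variables x lambdas) coefficient matrix per class
nonzero <- lapply(fit_g$beta, function(b) as.matrix(b) != 0)

# For each variable and lambda: in how many classes is the coefficient nonzero?
n_classes_in <- Reduce(`+`, nonzero)
table(n_classes_in)   # with "grouped", expect only 0 or the number of classes

# For each lambda: how many of the two V1 columns are in the model?
v1_in <- colSums(n_classes_in[c("V12", "V13"), ] > 0)
table(v1_in)          # a count of 1 means only one recoded column is included
```

If v1_in shows any 1s, the two recoded columns did enter at different points
on the path, which is the behaviour you were trying to avoid.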

> 
> To summarize, for categorical variables, i would like to hierarchically 
> constrain inclusion / exclusion of recoded variables in the model - either 
> all of the recoded variables from the same original categorical  variable are 
> in, or all are out.

I do understand that I am possibly not directly answering your question, but in 
some respect I wonder if it deserves an answer. I think it is meaningful if 
some factor levels are "penalized-out" of models.

-- 
David Winsemius
Alameda, CA, USA

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
