Hello,

I am attempting to use elasticnet to classify a number of documents.

The features are words.  The data is coded as a matrix with one row per 
document and one column per word.  Each entry is binary: 1 if the word is 
present in the document, 0 if it is absent.

I want to use the cross-validation function of elasticnet (cv.enet).  However, 
when the function selects a random subset of the data for a given fold, some of 
the word columns may be all 0, because a given word simply isn't present in the 
rows sampled.  This causes the function to return an error about a variance 
of 0.
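
To make the problem concrete, here is a minimal base-R sketch using made-up 
data, not my real corpus.  The matrix, the word names and the fold assignment 
are all hypothetical, and cv.enet draws its own folds internally, so this only 
imitates what happens inside it.  The last few lines show one possible 
workaround (filtering words by document frequency), under the assumption that 
no fold holds more than ceiling(n/K) documents.

```r
# Made-up document-term matrix: 20 documents (rows), 4 words (columns), entries in {0,1}
n_docs <- 20
common <- sapply(1:3, function(j) as.integer(seq_len(n_docs) %% (j + 1) == 0))
rare <- integer(n_docs)
rare[3] <- 1L                      # this word occurs in document 3 only
x <- cbind(common, rare)
colnames(x) <- c("word1", "word2", "word3", "rare_word")

# Random 5-fold assignment (illustrative; cv.enet draws its own folds internally)
set.seed(42)
K <- 5
fold <- sample(rep(seq_len(K), length.out = n_docs))

# For each fold, name the columns that are constant across the training rows
constant_in_training <- lapply(seq_len(K), function(k) {
  train <- x[fold != k, , drop = FALSE]
  is_constant <- apply(train, 2, function(col) length(unique(col)) == 1)
  colnames(x)[is_constant]
})

# Possible workaround (an assumption, not a tested fix): keep only words whose
# 1s and 0s each outnumber the largest fold, so that no single held-out fold
# can remove every 1 or every 0 from the training rows
max_fold_size <- ceiling(n_docs / K)
n_ones <- colSums(x)
keep <- n_ones > max_fold_size & (n_docs - n_ones) > max_fold_size
x_filtered <- x[, keep, drop = FALSE]
```

Here rare_word is all 0 in the training rows whenever document 3 is held out, 
which is the situation that triggers the error.  The filter drops it before 
cross-validation, at the cost of discarding rare words that might carry signal.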

Any suggestions on how to mitigate this issue, given that I want to use 5-fold 
cross-validation to determine the optimal tuning parameters?


Thanks!


--
Noah Silverman, M.S.
UCLA Department of Statistics
8117 Math Sciences Building
Los Angeles, CA 90095
