Hi all, I am using the glmnet R package to run LASSO with binary logistic regression. I have over 290 samples with outcome data (0 for alive, 1 for dead) and over 230 predictor variables. I am currently using LASSO to reduce the number of predictor variables.
I am using the cv.glmnet function to do 10-fold cross-validation on a sequence of lambda values which I let glmnet determine. I then take the optimal lambda value (lambda.1se), which I use to predict on an independent cohort. What I am finding is that this optimal lambda value fluctuates every time I run glmnet with LASSO. It deviates enough that each time I generate an ROC curve for my validation cohort, I get somewhat different AUC values. Does anyone know why there is such a fluctuation in the choice of optimal lambda? I am thinking it might be due to the 10-fold cross-validation step: the training set may not be split so that each fold has enough alive and dead cases. Thoughts?
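In case a sketch helps clarify the question: the fluctuation I describe goes away if the fold assignments are fixed, e.g. by passing an explicit foldid vector to cv.glmnet (the foldid argument is part of cv.glmnet; the stratified assignment below is just one way to build it, and x and y here are toy stand-ins for my actual data):

```r
library(glmnet)

## Toy data standing in for the real 290-sample, 230-predictor matrix
set.seed(1)
x <- matrix(rnorm(290 * 230), nrow = 290)
y <- rbinom(290, 1, 0.3)   # 0 = alive, 1 = dead

## Stratified fold IDs: assign folds separately within each outcome class,
## so every fold contains both alive and dead cases
foldid <- integer(length(y))
foldid[y == 0] <- sample(rep(1:10, length.out = sum(y == 0)))
foldid[y == 1] <- sample(rep(1:10, length.out = sum(y == 1)))

## With foldid fixed, the CV curve -- and hence lambda.1se -- is reproducible
cvfit <- cv.glmnet(x, y, family = "binomial", alpha = 1, foldid = foldid)
cvfit$lambda.1se
```

But I am unsure whether fixing the folds like this is the right answer, or whether I should instead be averaging the error curves over many repeated random CV splits before picking lambda.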