Actually, just to follow up on this, I believe I see what's wrong. I overlooked the fact that I'm comparing classification with the lasso in R's glmnet package to LassoCV here, which is a regression model; the linear model that supports classification in scikit-learn would be logistic regression. Sorry!

________________________________
From: scikit-learn [scikit-learn-bounces+ysa=mit....@python.org] on behalf of Yoel Sanchez Araujo [y...@mit.edu]
Sent: Friday, September 30, 2016 10:51 AM
To: Scikit-learn user and developer mailing list
Subject: [scikit-learn] understanding lasso performance ?
Hi all,

Today I updated to the latest release of scikit-learn and went to test out LassoCV in linear_model. I've tried both approaches below, and my accuracy seems very poor, while using the exact same data with glmnet in R, for example, gives me ~75% accuracy:

    from sklearn import linear_model
    from sklearn.model_selection import StratifiedKFold, train_test_split

    # endo_Xv (features) and endo_y (labels) are my data arrays
    lassocv1 = linear_model.LassoCV(cv=10, max_iter=10000, n_alphas=10000)
    xtrain, xtest, ytrain, ytest = train_test_split(
        endo_Xv, endo_y, test_size=.25, random_state=1)
    lassocv1.fit(xtrain, ytrain)
    lassocv1.score(xtest, ytest)

From this, lassocv1.coef_ returns all zero coefficients.

I've also tried this:

    k_fold_S = StratifiedKFold(n_splits=10, shuffle=False)
    lasso_cv = linear_model.LassoCV()
    alphas = []
    scores = []
    coefs = []
    ks = []
    # refit on each training fold and score on the held-out fold
    for k, (train, test) in enumerate(k_fold_S.split(endo_Xv, endo_y)):
        lasso_cv.fit(endo_Xv[train], endo_y[train])
        scores.append(lasso_cv.score(endo_Xv[test], endo_y[test]))
        alphas.append(lasso_cv.alpha_)
        coefs.append(lasso_cv.coef_)
        ks.append(k)

For all k, the coef_ arrays are all zero, and the scores array is, for example:

    [-1.3295256159340241e-05, -1.3295256159562285e-05,
     -1.3295256159784328e-05, -1.3295256159562285e-05,
     -1.3295256159562285e-05, -1.3295256159340241e-05,
     -6.4162287406910323e-05, -6.4162287406910323e-05,
     -6.4162287406910323e-05, -3.8436343168246623e-06]

Any insights would be greatly appreciated. I'm not sure whether this has anything to do with the update, but yesterday (before updating) I was getting better performance.
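A note on the scores listed above: for a regressor such as LassoCV, scikit-learn's score method returns the coefficient of determination (R^2), not classification accuracy. A fit with all-zero coefficients predicts a constant (the training-set mean of y), so its R^2 on held-out data comes out at or just below zero, which would match the tiny negative values reported. The sketch below reproduces that effect in plain Python; the 0/1 labels are made up for illustration and are not the endo_y data, and r2_score here is a hand-written helper rather than the library function.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical 0/1 class labels for one train/test split (illustrative only)
y_train = [0, 1] * 45 + [1]   # 91 samples, slightly more ones than zeros
y_test = [0, 1] * 5           # 10 samples, perfectly balanced

# An all-zero-coefficient linear model predicts only its intercept,
# which is the mean of the training targets.
train_mean = sum(y_train) / len(y_train)
constant_pred = [train_mean] * len(y_test)

r2_constant = r2_score(y_test, constant_pred)
print("R^2 of the constant predictor:", r2_constant)
```

The result is a small negative number because the training mean differs slightly from the test-fold mean; it says nothing about how many labels would be classified correctly.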
_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn
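For the classification setup described in the follow-up at the top of this thread, a rough scikit-learn counterpart to a lasso-penalized classifier would be L1-penalized logistic regression. The sketch below is only an illustration under stated assumptions: it uses synthetic data from make_classification as a stand-in for endo_Xv / endo_y, the fold count and Cs grid are arbitrary choices, and the penalty="l1" / solver="liblinear" spelling may change between scikit-learn versions. It is not expected to reproduce the glmnet numbers.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LassoCV, LogisticRegressionCV
from sklearn.model_selection import train_test_split

# Synthetic binary-classification data (hypothetical stand-in for the real arrays)
X, y = make_classification(n_samples=400, n_features=20,
                           n_informative=5, random_state=0)
xtrain, xtest, ytrain, ytest = train_test_split(
    X, y, test_size=0.25, random_state=1, stratify=y)

# LassoCV is a regressor: score() reports R^2 on the 0/1 targets, not accuracy
lasso = LassoCV(cv=5).fit(xtrain, ytrain)
lasso_r2 = lasso.score(xtest, ytest)

# L1-penalized logistic regression is a classifier: score() reports mean accuracy
clf = LogisticRegressionCV(Cs=10, cv=5, penalty="l1", solver="liblinear")
clf.fit(xtrain, ytrain)
clf_accuracy = clf.score(xtest, ytest)

# Number of features the L1 penalty left with a nonzero coefficient
n_selected = int(np.sum(clf.coef_ != 0))

print("LassoCV R^2:", lasso_r2)
print("Logistic (L1) accuracy:", clf_accuracy)
print("Nonzero logistic coefficients:", n_selected)
```

The two score values are on different scales (R^2 versus fraction of correct labels), so they should not be compared directly with each other or with an accuracy figure from another package.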