That really depends on your dataset. Maybe it is an "easy" dataset and not much regularization is needed.
Maybe use PCA(n_components=2) or an LDA transform to take a look at your data in 2D. Maybe the classes are easily linearly separable? scikit-learn does not do any feature selection unless you ask it to. What C values are you using? Try an np.logspace, but go much farther out on both sides than you think reasonable, then plot AUC as a function of C to get a global idea of what is going on. A rough sketch of both ideas is below the quoted message.

hth,
Michael

On Friday, September 30, 2016, Kristen M. Altenburger <kalt...@stanford.edu> wrote:
> Hi All,
>
> I am trying to understand Python's code [function '_fit_liblinear' in
> https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/svm/base.py]
> for fitting an L2-logistic regression with the 'liblinear' solver. More
> specifically, my [approximately balanced class] dataset is such that the
> # of predictors [p=2000] >> # of observations [n=100]. I am therefore
> confused that when I increase C [and thus decrease the regularization
> strength] in fitting the logistic regression model to my training data,
> I still obtain high AUC results when the model is applied to my testing
> data. Is Python internally doing a feature selection when fitting this
> model for high C values? Or why is it that the almost unregularized model
> [high C values] and the regularized model [cross-validated approach to
> selecting C] give similar AUC and accuracy results on the testing data?
> Should I be coding my predictors as +1/-1?
>
> Any pointers/explanations would be much appreciated!
>
> Thanks,
> Kristen
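P.S. Roughly what I had in mind (an untested sketch, not a recipe -- X and y here stand in for your feature matrix and binary labels, and the particular C range is just an example; widen or shift it as needed):

import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# 1) Look at the data in 2D -- are the classes (nearly) linearly separable?
X_2d = PCA(n_components=2).fit_transform(X)
plt.scatter(X_2d[:, 0], X_2d[:, 1], c=y)
plt.title("PCA projection to 2D")
plt.show()

# 2) Sweep C over a much wider range than seems reasonable and
#    plot held-out AUC as a function of C.
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)
Cs = np.logspace(-8, 8, 33)
aucs = []
for C in Cs:
    clf = LogisticRegression(penalty='l2', C=C, solver='liblinear')
    clf.fit(X_train, y_train)
    aucs.append(roc_auc_score(y_test, clf.decision_function(X_test)))

plt.semilogx(Cs, aucs)
plt.xlabel("C")
plt.ylabel("test AUC")
plt.show()

If the AUC curve is essentially flat across many orders of magnitude of C, that would explain why the cross-validated and nearly unregularized models look the same on your test data.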