Hi everyone,

I noticed recently that in the Lasso implementation (and docs), the MSE
term in the objective is normalized by the number of samples:
https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.Lasso.html
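
For reference, the objective given in the Lasso docs is

    (1 / (2 * n_samples)) * ||y - Xw||^2_2 + alpha * ||w||_1

so the data-fit term is an average over samples.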

For LogisticRegression with an L1 penalty, however, the log-loss does
not appear to be normalized by the number of samples. One consequence is
that the effective regularization strength depends explicitly on the
number of samples: if you tile a dataset N times, Lasso learns the same
coef, but LogisticRegression learns a different one.
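
By contrast, the user guide writes the L1-penalized logistic regression
objective with the log-loss summed, not averaged, over samples:

    min_{w, c} ||w||_1 + C * sum_i log(exp(-y_i * (X_i^T w + c)) + 1)

Here is a minimal sketch of the tiling effect I mean (the alpha=0.1 and
C=1.0 values are arbitrary, just for illustration):

import numpy as np
from sklearn.linear_model import Lasso, LogisticRegression

rng = np.random.RandomState(0)
X = rng.randn(50, 5)
y_reg = X @ np.array([1.0, -2.0, 0.0, 0.0, 3.0]) + rng.randn(50)
y_clf = (y_reg > 0).astype(int)

# Tile the rows of the dataset 10 times.
X10 = np.tile(X, (10, 1))

# Lasso averages the MSE term, so tiling leaves the coef unchanged.
print(Lasso(alpha=0.1).fit(X, y_reg).coef_)
print(Lasso(alpha=0.1).fit(X10, np.tile(y_reg, 10)).coef_)

# LogisticRegression sums the log-loss, so tiling acts like multiplying
# C by 10, and the coef changes.
clf = LogisticRegression(penalty="l1", C=1.0, solver="liblinear")
print(clf.fit(X, y_clf).coef_)
print(clf.fit(X10, np.tile(y_clf, 10)).coef_)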

Is this the intended behavior of LogisticRegression? I was surprised by
it. Either way, it would be helpful to document this more clearly in the
LogisticRegression docs (I can make a PR):
https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html

Jesse
