Hi everyone,

I noticed recently that in the Lasso implementation (and docs), the MSE
term in the objective is normalized by the number of samples:
https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.Lasso.html
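
For reference, the objective given in the Lasso docs is

    (1 / (2 * n_samples)) * ||y - Xw||^2_2 + alpha * ||w||_1

so the data-fit term is an average over samples.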

For LogisticRegression with an L1 penalty, however, the log-loss does
not appear to be normalized by the number of samples. One consequence is
that the effective regularization strength depends explicitly on the
number of samples: if you tile a dataset N times, Lasso learns the same
coef, but LogisticRegression learns a different one.
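
By contrast, the user guide writes the L1-penalized logistic regression
objective with the log-loss summed, not averaged, over samples:

    min_{w, c} ||w||_1 + C * sum_i log(exp(-y_i * (X_i^T w + c)) + 1)

Here is a minimal sketch of the tiling effect I mean (the alpha=0.1 and
C=1.0 values are arbitrary, just for illustration):

import numpy as np
from sklearn.linear_model import Lasso, LogisticRegression

rng = np.random.RandomState(0)
X = rng.randn(50, 5)
y_reg = X @ np.array([1.0, -2.0, 0.0, 0.0, 3.0]) + rng.randn(50)
y_clf = (y_reg > 0).astype(int)

# Tile the rows of the dataset 10 times.
X10 = np.tile(X, (10, 1))

# Lasso averages the MSE term, so tiling leaves the coef unchanged.
print(Lasso(alpha=0.1).fit(X, y_reg).coef_)
print(Lasso(alpha=0.1).fit(X10, np.tile(y_reg, 10)).coef_)

# LogisticRegression sums the log-loss, so tiling acts like multiplying
# C by 10, and the coef changes.
clf = LogisticRegression(penalty="l1", C=1.0, solver="liblinear")
print(clf.fit(X, y_clf).coef_)
print(clf.fit(X10, np.tile(y_clf, 10)).coef_)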

Is this the intended behavior of LogisticRegression? I was surprised by
it. Either way, it would be helpful to document this more clearly in the
LogisticRegression docs (I can make a PR):
https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html

Jesse
