[scikit-learn] Changes in the classifier

2020-01-08 Thread aditya aggarwal
Hello, I'm trying to change the entropy function that sklearn uses for DecisionTreeClassifier, locally on my system. When I rerun the pip install --editable . command after updating the Cython file, I receive the following error message: Error compiling Cython file:
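For reference, here is a minimal pure-Python sketch of the quantity the 'entropy' criterion computes (the real implementation lives in Cython, in sklearn/tree/_criterion.pyx; the function name here is illustrative). Checking a modified criterion against something like this can help verify the math separately from the compilation problem:

    import numpy as np

    def entropy(class_counts):
        # Shannon entropy of a node: -sum_k p_k * log2(p_k),
        # the impurity that criterion='entropy' minimizes.
        counts = np.asarray(class_counts, dtype=float)
        p = counts / counts.sum()
        p = p[p > 0]  # 0 * log(0) is taken to be 0
        return -np.sum(p * np.log2(p))

    print(entropy([5, 5]))   # 1.0: maximally mixed binary node
    print(entropy([10, 0]))  # 0.0: pure node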

Re: [scikit-learn] Why ridge regression can solve multicollinearity?

2020-01-08 Thread Brown J.B. via scikit-learn
Just for convenience: Marquardt, Donald W., and Ronald D. Snee. "Ridge regression in practice." *The American Statistician* 29, no. 1 (1975): 3-20. https://amstat.tandfonline.com/doi/abs/10.1080/00031305.1975.10479105

Re: [scikit-learn] Why ridge regression can solve multicollinearity?

2020-01-08 Thread josef.pktd
On Wed, Jan 8, 2020 at 9:43 PM wrote:
> On Wed, Jan 8, 2020 at 9:38 PM lampahome wrote:
>> Stuart Reynolds wrote on Thursday, January 9, 2020 at 10:33 AM:
>>> Correlated features typically have the property that they tend to be similarly predictive of the outcome.
>>>
>>> L1 and L2 are both a preference for low coefficients.

Re: [scikit-learn] Why ridge regression can solve multicollinearity?

2020-01-08 Thread josef.pktd
On Wed, Jan 8, 2020 at 9:38 PM lampahome wrote:
> Stuart Reynolds wrote on Thursday, January 9, 2020 at 10:33 AM:
>> Correlated features typically have the property that they tend to be similarly predictive of the outcome.
>>
>> L1 and L2 are both a preference for low coefficients.
>> If a coefficient can be reduced yet another coefficient maintains similar loss, then these regularization methods prefer this solution.

Re: [scikit-learn] Why ridge regression can solve multicollinearity?

2020-01-08 Thread lampahome
Stuart Reynolds wrote on Thursday, January 9, 2020 at 10:33 AM:
> Correlated features typically have the property that they tend to be similarly predictive of the outcome.
>
> L1 and L2 are both a preference for low coefficients.
> If a coefficient can be reduced yet another coefficient maintains similar loss, then these regularization methods prefer this solution.

Re: [scikit-learn] Why ridge regression can solve multicollinearity?

2020-01-08 Thread Stuart Reynolds
Correlated features typically have the property that they tend to be similarly predictive of the outcome. L1 and L2 are both a preference for low coefficients. If a coefficient can be reduced yet another coefficient maintains similar loss, then these regularization methods prefer this solution.
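To make that concrete, here is a small illustrative sketch (the data and alpha values are made up) showing how the two penalties distribute weight across two nearly duplicated columns:

    import numpy as np
    from sklearn.linear_model import Lasso, Ridge

    rng = np.random.RandomState(0)
    x = rng.randn(200)
    # Two almost-identical (highly correlated) features
    X = np.column_stack([x, x + 1e-3 * rng.randn(200)])
    y = 3 * x + 0.1 * rng.randn(200)

    print(Lasso(alpha=0.1).fit(X, y).coef_)  # L1: tends to zero out one of the pair
    print(Ridge(alpha=1.0).fit(X, y).coef_)  # L2: tends to split the weight evenly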

[scikit-learn] Why ridge regression can solve multicollinearity?

2020-01-08 Thread lampahome
I found many blogs saying that L2 regularization solves multicollinearity, but they don't say how it works. I know LASSO can select features via L1 regularization, so maybe it can also solve this. Can anyone explain how ridge handles multicollinearity? thx
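For what it's worth, the standard textbook answer is visible in the closed-form solutions. OLS has to invert X^T X, which is singular (or nearly so) when features are collinear; ridge adds lambda to every eigenvalue:

    \hat{\beta}_{\text{OLS}} = (X^\top X)^{-1} X^\top y,
    \qquad
    \hat{\beta}_{\text{ridge}} = (X^\top X + \lambda I)^{-1} X^\top y

Since every eigenvalue of X^T X + lambda*I is at least lambda > 0, the matrix is always invertible and the coefficient estimates stop blowing up. LASSO's L1 penalty has no closed form; it deals with correlated features differently, by zeroing some of them out.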

Re: [scikit-learn] logistic regression results are not stable between solvers

2020-01-08 Thread Andreas Mueller
Hi Ben. Liblinear and l-bfgs might both converge, but to different solutions, given that the intercept is penalized in liblinear. There are also problems with ill-conditioning that are hard to detect. My impression of SAGA was that the convergence checks are too loose and we should improve them. Have
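A small sketch of that effect (the data here is illustrative): liblinear folds the intercept into the penalized weight vector, and its intercept_scaling parameter reduces, but does not remove, that penalty:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.RandomState(0)
    X = rng.randn(500, 3)
    # Large true intercept, so the intercept penalty matters
    y = (X @ [1.0, -2.0, 0.5] + 4.0 + 0.5 * rng.randn(500)) > 0

    for kwargs in [dict(solver="lbfgs"),
                   dict(solver="liblinear"),
                   dict(solver="liblinear", intercept_scaling=100.0)]:
        clf = LogisticRegression(C=1.0, max_iter=10_000, **kwargs).fit(X, y)
        print(kwargs, clf.intercept_, clf.coef_.round(3))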

Re: [scikit-learn] logistic regression results are not stable between solvers

2020-01-08 Thread Benoît Presles
With lbfgs n_iter_ = 48, with saga n_iter_ = 326581, with liblinear n_iter_ = 64.

On 08/01/2020 21:18, Guillaume Lemaître wrote:
> We issue a convergence warning. Can you check n_iter_ to be sure that the solver actually converged within the stated tolerance?
> On Wed, 8 Jan 2020 at 20:53, Benoît Presles
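For anyone following along, a sketch of that check (X and y stand in for the simulated data from the original post; max_iter is illustrative):

    import warnings
    from sklearn.exceptions import ConvergenceWarning
    from sklearn.linear_model import LogisticRegression

    for solver in ("lbfgs", "saga", "liblinear"):
        with warnings.catch_warnings(record=True) as caught:
            warnings.simplefilter("always", ConvergenceWarning)
            clf = LogisticRegression(solver=solver, max_iter=1000).fit(X, y)
        warned = any(issubclass(w.category, ConvergenceWarning) for w in caught)
        print(solver, "n_iter_ =", clf.n_iter_, "| hit max_iter:", warned)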

Re: [scikit-learn] logistic regression results are not stable between solvers

2020-01-08 Thread Guillaume Lemaître
We issue a convergence warning. Can you check n_iter_ to be sure that the solver actually converged within the stated tolerance?

On Wed, 8 Jan 2020 at 20:53, Benoît Presles wrote:
> Dear sklearn users,
>
> I still have some issues concerning logistic regression.
> I compared sklearn with three different solvers (lbfgs, saga, liblinear) and statsmodels on the same (simulated) data.

Re: [scikit-learn] logistic regression results are not stable between solvers

2020-01-08 Thread Benoît Presles
Dear sklearn users,

I still have some issues concerning logistic regression. I compared sklearn with three different solvers (lbfgs, saga, liblinear) and statsmodels on the same (simulated) data. When everything goes well, I get the same results between lbfgs, saga, liblinear and statsmodels.
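A sketch of that comparison, with one caveat worth keeping in mind: sklearn penalizes by default (C=1.0), while statsmodels' Logit fits an unpenalized MLE, so the coefficients only agree when the sklearn penalty is turned off (X and y stand in for the simulated data):

    import statsmodels.api as sm
    from sklearn.linear_model import LogisticRegression

    # penalty='none' (sklearn >= 0.21) disables regularization entirely
    sk = LogisticRegression(penalty="none", solver="lbfgs", max_iter=10_000).fit(X, y)
    smod = sm.Logit(y, sm.add_constant(X)).fit(disp=0)

    print("sklearn:    ", sk.intercept_, sk.coef_.ravel())
    print("statsmodels:", smod.params)  # first entry is the intercept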