Re: [scikit-learn] Use of Scaler with LassoCV, RidgeCV

2016-09-13 Thread Brenet, Yoann
Hi Andreas, Thanks a lot for the information. Yoann Date: Tue, 13 Sep 2016 11:56:45 -0400 From: Andreas Mueller To: Scikit-learn user and developer mailing list Subject: Re: [scikit-learn] Use of Scaler with LassoCV, RidgeCV Message-ID: <17283c24-44b3-4f52-3b7b-7824f1a4c...@gmail.com>

Re: [scikit-learn] Use of Scaler with LassoCV, RidgeCV

2016-09-13 Thread Andreas Mueller
It's here (and it's old and probably out of date): https://github.com/scikit-learn/scikit-learn/issues/1626 On 09/13/2016 08:45 AM, Brenet, Yoann wrote: Hi Sebastian, Many thanks, that's what I was thinking I should be doing, so thanks a lot for confirming that was the way to go. Really appre

Re: [scikit-learn] Use of Scaler with LassoCV, RidgeCV

2016-09-13 Thread Andreas Mueller
There is no way to use the "efficient" EstimatorCV objects with pipelines. This is an API bug and there's an open issue and maybe even a PR for that. On 09/13/2016 08:45 AM, Brenet, Yoann wrote: Hi Sebastian, Many thanks, that's what I was thinking I should be doing, so thanks a lot for confir
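One general workaround for the limitation described above is to search over the regularization strength with a plain estimator inside a pipeline. This is only a sketch: the data is synthetic, the alpha grid is an arbitrary assumption, and this route refits the model for every candidate rather than reusing the efficient path computation of LassoCV.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data with features on very different scales (illustration only).
rng = np.random.RandomState(0)
X = rng.normal(size=(80, 3)) * np.array([1.0, 10.0, 100.0])
y = X[:, 0] + 0.1 * X[:, 1] + rng.normal(scale=0.1, size=80)

# The scaler lives inside the pipeline, so it is refit on each training fold.
pipe = Pipeline([("scale", StandardScaler()), ("lasso", Lasso())])

# Hypothetical grid of alphas; the "lasso__" prefix targets the pipeline step.
param_grid = {"lasso__alpha": [0.001, 0.01, 0.1, 1.0]}

search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X, y)
best_alpha = search.best_params_["lasso__alpha"]
```

The same pattern should work with Ridge in place of Lasso by renaming the step and the parameter prefix.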

Re: [scikit-learn] Problems with plotting decision regions

2016-09-13 Thread Sebastian Raschka
Thanks a lot, Jake, ‘viridis’ does indeed seem to work. I guess I should move this to the matplotlib bug tracker then. Best, Sebastian > On Sep 13, 2016, at 10:58 AM, Jacob Vanderplas wrote: > > It seems to work correctly if you replace the colormap with a continuous one > like 'viridis'. I s

Re: [scikit-learn] Problems with plotting decision regions

2016-09-13 Thread Jacob Vanderplas
It seems to work correctly if you replace the colormap with a continuous one like 'viridis'. I suspect this is a bug in matplotlib's ListedColormap. Jake Jake VanderPlas Senior Data Science Fellow Director of Research in Physical Sciences University of Washington eScience Institute On Tue,

[scikit-learn] Problems with plotting decision regions

2016-09-13 Thread Sebastian Raschka
Hi, all, I am having some problems with showing decision regions in 2D if more than 4 classes are present. I really can’t figure out the source of the problem and would appreciate some help if you have done this before or have any pointers, since I am afraid that I am overlooking something re

Re: [scikit-learn] is RandomForest random samples or random features?

2016-09-13 Thread 斌洪
Thanks to all of you. I think I have got the point. ^_^ 2016-09-13 20:30 GMT+08:00 Dale T Smith: > Wrong! Apologies, I had a double loop in there. > > Get a random sample of the training data > > For I to n_estimators: > > Build a tree – this involves a *random sample of fea

Re: [scikit-learn] Use of Scaler with LassoCV, RidgeCV

2016-09-13 Thread Brenet, Yoann
Hi Sebastian, Many thanks, that's what I was thinking I should be doing, so thanks a lot for confirming that was the way to go. Really appreciate the help, Yoann Date: Tue, 13 Sep 2016 08:33:52 -0400 From: Sebastian Raschka To: Scikit-learn user and developer mailing list Subject: Re

Re: [scikit-learn] Use of Scaler with LassoCV, RidgeCV

2016-09-13 Thread Sebastian Raschka
Hi, Yoann, if I understand correctly, you want to apply the scaling in each iteration of cross-validation (i.e., the recommended way to do it)? Here, you could use the make_pipeline function, which will call fit on each training fold and call transform on each test fold: from sklearn.prepro
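The code in the message above is cut off in the archive. The following is not that code but a minimal sketch of the make_pipeline idea it describes, assuming synthetic data, a fixed Lasso alpha, and 5-fold cross-validation chosen purely for illustration.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic regression data (an assumption, not from the thread).
rng = np.random.RandomState(0)
X = rng.normal(loc=5.0, scale=3.0, size=(100, 4))
y = X @ np.array([1.5, 0.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=100)

# Scaler and model are bundled, so for every fold the scaler is fit on the
# training part only and then applied to the held-out part.
pipe = make_pipeline(StandardScaler(), Lasso(alpha=0.1))

# One score per fold; the default scoring for regressors is R^2.
scores = cross_val_score(pipe, X, y, cv=5)
```

Because the scaler never sees the held-out fold during fitting, the test statistics do not leak into the training step.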

Re: [scikit-learn] is RandomForest random samples or random features?

2016-09-13 Thread Dale T Smith
Wrong! Apologies, I had a double loop in there. Get a random sample of the training data For I to n_estimators: Build a tree – this involves a random sample of features and thresholds for each feature in the training data sample at each node. Use the rest of the tr

Re: [scikit-learn] Use of Scaler with LassoCV, RidgeCV

2016-09-13 Thread Dale T Smith
Hmm. I would scale the training data, and then use the same scaling on the test and validation data. This isn’t quite what you asked, but it’s close and does involve transformations and pipelines. Perhaps you can modify according to your use case, introducing the scaling before PolynomialFeature
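The "scale the training data, then reuse the same scaling" idea from the message above can be sketched as follows. The data and the 75/25 split are assumptions made for illustration; the point is only that the scaler's statistics come from the training portion.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic feature matrix (illustration only).
rng = np.random.RandomState(0)
X = rng.normal(loc=10.0, scale=4.0, size=(200, 2))

X_train, X_test = train_test_split(X, test_size=0.25, random_state=0)

# Learn mean and standard deviation from the training data only.
scaler = StandardScaler().fit(X_train)

# Apply the same learned scaling to both sets.
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)
```

The scaled test set will generally not have exactly zero mean and unit variance, which is expected since its own statistics were never used.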

Re: [scikit-learn] is RandomForest random samples or random features?

2016-09-13 Thread Dale T Smith
Each tree is built using a random sample with replacement from the provided training data. The data not in the sample is used to calculate the out-of-bag score. The “bag” is the sampled data. The “random” refers to several features of the algorithm, including random sampling of features. So for
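To make the "bag" and "out-of-bag" terms above concrete, here is a small NumPy sketch of a single bootstrap draw. It is not scikit-learn's internal implementation; the sample count and seed are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 10

# The "bag": row indices drawn with replacement, same size as the data.
bag = rng.integers(0, n_samples, size=n_samples)

# Out-of-bag rows: indices that were never drawn for this tree.
oob = np.setdiff1d(np.arange(n_samples), bag)
```

In a forest this draw would be repeated for each tree, and a tree's out-of-bag rows can serve as held-out data for that tree.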

[scikit-learn] Use of Scaler with LassoCV, RidgeCV

2016-09-13 Thread Brenet, Yoann
Hi all, I was trying to use scikit-learn LassoCV/RidgeCV while applying a 'StandardScaler' on each fold set. I do not want to apply the scaler before the cross-validation to avoid leakage but I cannot figure out how I am supposed to do that with LassoCV/RidgeCV. Is there a way to do this? Or

Re: [scikit-learn] is RandomForest random samples or random features?

2016-09-13 Thread Nicolas Drougard
You may want to use the parameter called "max_features". Indeed: "1.11.2.3. Parameters -- The main parameters to adjust when using these methods is n_estimators and max_features. The former is the number of trees in the forest. The larger the better, but also the longer it will take to compute. I
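As a rough illustration of the two parameters named in the quoted guide, the sketch below fits a small forest on synthetic data. The dataset shape, the 20 trees, and the "sqrt" setting are assumptions picked for the example, not recommendations from the thread.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic classification data with 16 features (illustration only).
X, y = make_classification(
    n_samples=200, n_features=16, n_informative=5, random_state=0
)

# n_estimators sets the number of trees; max_features sets how many
# features are considered as split candidates at each node.
forest = RandomForestClassifier(
    n_estimators=20, max_features="sqrt", random_state=0
)
forest.fit(X, y)

# Each fitted tree records the resolved number of candidate features.
features_per_split = forest.estimators_[0].max_features_
```

With 16 input features, "sqrt" resolves to 4 candidate features per split.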

[scikit-learn] is RandomForest random samples or random features?

2016-09-13 Thread 斌洪
I have read the sklearn user guide on RandomForest: """ In random forests (see RandomForestClassifier and RandomForestRegressor