Re: [scikit-learn] Generalized Discriminant Analysis with Kernel

2017-01-10 Thread Valery Anisimovsky via scikit-learn
Hi Raga, You may try approximating your kernel using the Nystroem kernel approximator (kernel_approximation.Nystroem) and then applying LDA to the transformed feature vectors. If you choose the dimensionality of the target space (n_components) large enough (depending on your kernel and data), Nystroem a
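The suggestion above can be sketched as a pipeline. This is a minimal, hypothetical example: the dataset, RBF kernel, gamma, and n_components are illustrative assumptions, not values from the thread.

```python
# Sketch: approximate a kernel with Nystroem, then apply LDA to the
# transformed features (a "kernel discriminant analysis" approximation).
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.kernel_approximation import Nystroem
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

X, y = load_iris(return_X_y=True)

kda = make_pipeline(
    Nystroem(kernel="rbf", gamma=0.5, n_components=50, random_state=0),
    LinearDiscriminantAnalysis(),
)
scores = cross_val_score(kda, X, y, cv=5)
print(scores.mean())
```

Increasing n_components improves the kernel approximation at the cost of a wider transformed feature space.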

Re: [scikit-learn] meta-estimator for multiple MLPRegressor

2017-01-10 Thread Thomas Evangelidis
Jacob, The features are not 6000. I train 2 MLPRegressors from two types of data; both refer to the same dataset (35 molecules in total), but each one contains a different type of information. The first data set consists of 60 features. I tried 100 different random states and measured the average |R| usin
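A simple way to combine two regressors trained on different feature sets of the same samples is to average their predictions. The sketch below is hypothetical: the data is synthetic, the second feature count and the network sizes are assumptions, and only the 35-sample / 60-feature shape mirrors the thread.

```python
# Hypothetical sketch: two MLPRegressors fit on different feature views of
# the same 35 samples, combined by averaging their predictions.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.RandomState(0)
X1 = rng.randn(35, 60)   # first feature type (60 features, as in the thread)
X2 = rng.randn(35, 20)   # second feature type (size assumed)
y = rng.randn(35)

m1 = MLPRegressor(hidden_layer_sizes=(10,), max_iter=2000, random_state=0).fit(X1, y)
m2 = MLPRegressor(hidden_layer_sizes=(10,), max_iter=2000, random_state=0).fit(X2, y)

# Naive meta-estimate: unweighted mean of the two models' predictions
y_pred = (m1.predict(X1) + m2.predict(X2)) / 2.0
print(y_pred.shape)
```

A weighted average (weights chosen by cross-validation) or a stacking regressor would be the natural next refinement.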

Re: [scikit-learn] Generalized Discriminant Analysis with Kernel

2017-01-10 Thread Raga Markely
Thank you very much for your info on the Nystroem kernel approximator. I appreciate it! Best, Raga

[scikit-learn] Specify boosting percentage using Randomoversampling?

2017-01-10 Thread Suranga Kasthurirathne
Hi all, I apologize - I've been looking for this answer all over the internet, and it could be that I'm not googling the right terms. For managing unbalanced datasets, Weka has SMOTE, and scikit-learn has RandomOverSampler. In Weka, we can ask it to boost by a given percentage (say 100%) so an unders

Re: [scikit-learn] Specify boosting percentage using Randomoversampling?

2017-01-10 Thread Michael Eickenberg
Is this contrib package maybe what you are looking for? Take a close look to see whether it does what you expect. http://contrib.scikit-learn.org/imbalanced-learn/auto_examples/over-sampling/plot_smote.html

Re: [scikit-learn] Specify boosting percentage using Randomoversampling?

2017-01-10 Thread Guillaume LemaƮtre
I will first assume that RandomOverSampling refers to the imbalanced-learn API (a scikit-learn-contrib project). The parameter you are seeking is the ratio parameter. By default ratio='auto', which will balance the classes, as you described. The ratio can be given as a float as the ratio of th

Re: [scikit-learn] Specify boosting percentage using Randomoversampling?

2017-01-10 Thread Suranga Kasthurirathne
Well actually, I'm able to answer this myself. It's the ratio parameter (see: http://contrib.scikit-learn.org/imbalanced-learn/generated/imblearn.over_sampling.RandomOverSampler.html) :) :)

Re: [scikit-learn] meta-estimator for multiple MLPRegressor

2017-01-10 Thread Stuart Reynolds
Thomas, Jacob's point is important -- it's not the number of features that matters, it's the number of free parameters. As the number of free parameters increases, the space of representable functions grows to the point where the cost function is minimized by having a single parameter explain eac
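The free-parameter count is easy to make concrete. The sketch below (architecture is an assumption, not the poster's actual setting) counts the weights of a small MLPRegressor on a 35-sample, 60-feature problem: even one modest hidden layer yields far more parameters than samples.

```python
# Count the free parameters of a small MLP: weight matrices plus biases.
# 60 inputs -> 10 hidden -> 1 output gives 60*10 + 10 + 10*1 + 1 = 621
# parameters, against only 35 training samples.
import numpy as np
from sklearn.neural_network import MLPRegressor

X = np.random.RandomState(0).randn(35, 60)
y = np.random.RandomState(1).randn(35)

mlp = MLPRegressor(hidden_layer_sizes=(10,), max_iter=50, random_state=0)
mlp.fit(X, y)

n_params = sum(w.size for w in mlp.coefs_) + sum(b.size for b in mlp.intercepts_)
print(n_params)
```

With ~18x more parameters than samples, the network can interpolate the training set, which is exactly the overfitting regime the message describes.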

Re: [scikit-learn] meta-estimator for multiple MLPRegressor

2017-01-10 Thread Thomas Evangelidis
Stuart, I didn't see LASSO perform well, especially with the second type of data. The alpha parameter probably needs adjustment with LassoCV. I don't know if you have read my previous messages in this thread, so I quote my settings for MLPRegressor again. MLPRegressor(random_state=random_state
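The LassoCV suggestion can be sketched briefly: let cross-validation choose alpha along a regularization path rather than relying on Lasso's default. The data here is synthetic with a planted sparse signal; the thread's actual features are not reproduced.

```python
# Sketch: tune Lasso's alpha by cross-validation with LassoCV on a small
# synthetic problem (35 samples, 60 features, one informative feature).
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.RandomState(0)
X = rng.randn(35, 60)
y = X[:, 0] * 2.0 + rng.randn(35) * 0.1  # sparse ground truth

lasso = LassoCV(cv=5, random_state=0).fit(X, y)
print(lasso.alpha_)               # alpha selected by cross-validation
print(int((lasso.coef_ != 0).sum()))  # number of features kept
```

If even the CV-tuned Lasso underperforms, that suggests the relationship is genuinely nonlinear rather than the regularization being mis-set.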