Hi Raga,
You may try approximating your kernel with the Nystroem kernel approximator
(kernel_approximation.Nystroem) and then applying LDA to the transformed
feature vectors. If you choose the dimensionality of the target space
(n_components) large enough (depending on your kernel and data),
Nystroem a
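The suggestion above could look roughly like this in code (the RBF kernel, gamma, n_components, and the synthetic data are all illustrative assumptions, not the actual setup from this thread):

```python
# Sketch: approximate a kernel with Nystroem, then apply LDA
# to the transformed features. Data and hyperparameters below
# are placeholders for illustration only.
import numpy as np
from sklearn.kernel_approximation import Nystroem
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.RandomState(0)
X = rng.randn(200, 30)                   # 200 samples, 30 features (toy data)
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # toy binary labels

# Map into an approximate RBF kernel feature space.
nystroem = Nystroem(kernel="rbf", gamma=0.1, n_components=100,
                    random_state=0)
X_mapped = nystroem.fit_transform(X)

# LDA on the kernel-mapped features.
lda = LinearDiscriminantAnalysis()
lda.fit(X_mapped, y)
print(X_mapped.shape)  # (200, 100)
```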
Jacob,
The features are not 6000. I train two MLPRegressors on two types of data;
both refer to the same dataset (35 molecules in total), but each one
contains a different type of information. The first dataset consists of 60
features. I tried 100 different random states and measured the average |R|
usin
Thank you very much for your info on Nystroem kernel approximator. I
appreciate it!
Best,
Raga
On Tue, Jan 10, 2017 at 7:47 AM, wrote:
Hi all,
I apologize - I've been looking for this answer all over the internet, and
it could be that I'm not googling the right terms.
For managing unbalanced datasets, Weka has SMOTE, and scikit-learn has
RandomOverSampler (via imbalanced-learn).
In Weka, we can ask it to boost by a given percentage (say 100%), so an
unders
Maybe this contrib project is what you are looking for? Take a close look
to see whether it does what you expect.
http://contrib.scikit-learn.org/imbalanced-learn/auto_examples/over-sampling/plot_smote.html
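For intuition, here is a minimal numpy sketch of the idea behind SMOTE (in practice, use the imbalanced-learn implementation linked above): each synthetic point interpolates between a minority sample and one of its nearest minority-class neighbours. The function name, k, and toy data are assumptions for illustration.

```python
# Minimal sketch of the SMOTE idea: synthesise new minority samples
# by interpolating between a sample and one of its k nearest
# minority-class neighbours.
import numpy as np

def smote_sketch(X_min, n_new, k=5, rng=None):
    """Generate n_new synthetic samples from minority-class data X_min."""
    rng = rng if rng is not None else np.random.RandomState(0)
    synthetic = []
    for _ in range(n_new):
        i = rng.randint(len(X_min))
        # Distances from sample i to all minority samples.
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        neighbours = np.argsort(d)[1:k + 1]  # skip the sample itself
        j = rng.choice(neighbours)
        t = rng.rand()                        # interpolation factor in [0, 1]
        synthetic.append(X_min[i] + t * (X_min[j] - X_min[i]))
    return np.array(synthetic)

X_min = np.random.RandomState(1).randn(20, 3)  # toy minority class
X_new = smote_sketch(X_min, n_new=20)
print(X_new.shape)  # (20, 3)
```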
On Tue, Jan 10, 2017 at 6:36 PM, Suranga Kasthurirathne <
[email protected]> wrote:
I will first assume that RandomOverSampler refers to the imbalanced-learn
API (a scikit-learn-contrib project).
The parameter you are looking for is the ratio parameter. By default,
ratio='auto', which will balance
the classes, as you described.
The ratio can also be given as a float, as the ratio of th
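As a rough, numpy-only illustration of what a balancing ratio does (interpreting a float ratio as the desired minority/majority count ratio is an assumption based on the description above; the helper below is hypothetical, not the imbalanced-learn API):

```python
# Sketch of random over-sampling to a target minority/majority ratio.
import numpy as np

def random_oversample(y, ratio=1.0, rng=None):
    """Return indices that oversample the minority class with replacement."""
    rng = rng if rng is not None else np.random.RandomState(0)
    classes, counts = np.unique(y, return_counts=True)
    minority = classes[np.argmin(counts)]
    target = int(ratio * counts.max())        # desired minority count
    min_idx = np.where(y == minority)[0]
    extra = rng.choice(min_idx, size=max(target - len(min_idx), 0),
                       replace=True)
    return np.concatenate([np.arange(len(y)), extra])

y = np.array([0] * 90 + [1] * 10)      # 90/10 imbalance
idx = random_oversample(y, ratio=1.0)  # ratio=1.0 balances the classes
print(np.bincount(y[idx]))             # [90 90]
```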
Well actually, I'm able to answer this myself. It's the ratio attribute
(see:
http://contrib.scikit-learn.org/imbalanced-learn/generated/imblearn.over_sampling.RandomOverSampler.html
)
:) :)
On Tue, Jan 10, 2017 at 12:36 PM, Suranga Kasthurirathne <
[email protected]> wrote:
Thomas,
Jacob's point is important -- it's not the number of features that
matters, it's the number of free parameters. As the number of free
parameters increases, the space of representable functions grows to the
point where the cost function is minimized by having a single parameter
explain eac
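As a back-of-the-envelope illustration of how quickly free parameters grow (the layer sizes here are hypothetical): with only 35 molecules in the dataset, even a single modest hidden layer yields far more parameters than samples.

```python
# Free-parameter count for a fully connected MLP:
# weights plus biases for each consecutive layer pair.
def mlp_n_params(layer_sizes):
    """Number of free parameters in a dense MLP with the given layer sizes."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

# 60 input features, one hidden layer of 100 units, 1 output:
print(mlp_n_params([60, 100, 1]))  # 6201
```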
Stuart,
I didn't see LASSO performing well, especially with the second type of
data. The alpha parameter probably needs tuning with LassoCV.
I don't know if you have read my previous messages in this thread, so I
quote my MLPRegressor settings again.
MLPRegressor(random_state=random_state
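For the LassoCV suggestion above, a minimal sketch might look like this (the synthetic data, shapes, and cv value are assumptions for illustration, not the poster's actual setup):

```python
# Sketch: let LassoCV pick alpha by cross-validation instead of
# using Lasso's default. Toy data mimicking 35 samples x 60 features.
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.RandomState(0)
X = rng.randn(35, 60)            # 35 samples, 60 features (toy data)
w = np.zeros(60)
w[:5] = rng.randn(5)             # only 5 informative features
y = X @ w + 0.1 * rng.randn(35)

# LassoCV searches its own alpha grid via cross-validation.
lasso = LassoCV(cv=5, random_state=0).fit(X, y)
print(lasso.alpha_ > 0)  # True
```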