hi folks,
when using extra trees, one can compute an oob score. has anybody looked at
comparing the oob_score to performing a shufflesplit iteration on the data?
are these in some ways equivalent, or would they converge to the same mean?
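to make the comparison concrete, here's roughly what i have in mind (just a
sketch with a toy dataset and placeholder parameters, written against the
current sklearn API; oob_score needs bootstrap=True):

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import ShuffleSplit, cross_val_score

# toy data standing in for the real problem
X, y = make_classification(n_samples=200, n_features=50, random_state=0)

# oob estimate: each tree is scored on the samples left out of its bootstrap sample
forest = ExtraTreesClassifier(n_estimators=500, bootstrap=True,
                              oob_score=True, random_state=0)
forest.fit(X, y)
print("oob score:", forest.oob_score_)

# shufflesplit: repeated random train/test splits of the same data
# (test_size ~ 0.37 only to roughly match the expected out-of-bag fraction, 1/e)
cv = ShuffleSplit(n_splits=25, test_size=0.37, random_state=0)
scores = cross_val_score(ExtraTreesClassifier(n_estimators=500, random_state=0),
                         X, y, cv=cv)
print("shufflesplit mean score:", scores.mean())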
cheers,
satra
--
Hi Satrajit,
Adding more trees should never hurt accuracy. The more, the better.
Since you have a lot of irrelevant features, I'd advise increasing
max_features in order to capture the relevant features when computing
the random splits. Otherwise, your trees will indeed fit on noise.
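Something along these lines, for instance (only a sketch: the dataset is
synthetic and the parameter values are illustrative, not tuned recommendations):

from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier

# wide, noisy toy data: few samples, many mostly irrelevant features
X, y = make_classification(n_samples=100, n_features=1000,
                           n_informative=20, random_state=0)

forest = ExtraTreesClassifier(
    n_estimators=1000,   # more trees only cost CPU time, never accuracy
    max_features=0.5,    # look at half the features at each split instead of
                         # the small default, so relevant ones get picked up
    random_state=0,
)
forest.fit(X, y)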
Best,
Gilles
thanks paolo, will give all of this a try.
i'll also send a pr with a section on patterns for sklearn. although this
pattern might be specific to my problem domain, having more real-world
scripts/examples that reflect such considerations might be useful to the
community.
cheers,
satra
Hi Satrajit,
On Sun, Mar 25, 2012 at 3:02 PM, Satrajit Ghosh wrote:
> hi giles,
>
> when dealing with skinny matrices of the type few samples x lots of
> features what are the recommendations for extra trees in terms of max
> features and number of estimators?
as far as the number of estimators is concerned, the more, the better; the
cost is only computation time.
On Sun, Mar 25, 2012 at 3:32 PM, Paolo Losi wrote:
> You could rank features by feature importance and perform recursive feature
> limitation
s/recursive feature limitation/recursive feature elimination/
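In case a concrete version helps, here is a minimal sketch of that loop
(assuming an ExtraTreesClassifier provides the importance ranking and RFECV
drives the elimination; the step and CV settings are arbitrary):

from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import RFECV
from sklearn.model_selection import ShuffleSplit

X, y = make_classification(n_samples=100, n_features=500,
                           n_informative=15, random_state=0)

# recursive feature elimination: repeatedly drop the features ranked least
# important by feature_importances_, keep the subset that cross-validates best
ranker = ExtraTreesClassifier(n_estimators=250, random_state=0)
selector = RFECV(ranker, step=0.1,
                 cv=ShuffleSplit(n_splits=10, test_size=0.25, random_state=0))
selector.fit(X, y)

print("features kept:", selector.n_features_)
X_reduced = selector.transform(X)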
hi giles,
when dealing with skinny matrices of the type few samples x lots of
features, what are the recommendations for extra trees in terms of
max_features and number of estimators?
also, if a lot of the features are nuisance variables and most are noisy, are
there any recommendations for feature reduction?