Maybe some of the tree huggers can say something about that ;) Below is
my best guess.
I am surprised to see that the docs say no regularization is usually best.
I would not use such large upper bounds as you did, and I would never
search the full range, but rather steps to get only a few cand
Hi Satra,
In my experience, adjusting max_features can make some difference (I work with
image data).
Cheers,
Michal
thanks andy.
are there any general heuristics for these parameters - given that their
ranges are over the samples?
max_depth = range(1, nsamples)
or
min_samples_leaf = range(1, nsamples)
also related question: given that nsamples would actually depend on the cv
method of the GridSearchCV, is t
Hi Satra.
You should set "n_estimators" as high as you can afford, time- and
memory-wise, and then cross-validate over (at least) one of the
regularization parameters,
for example over max_depth or min_samples_leaf. You can also search
over max_features.
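
A minimal sketch of the advice above, not a definitive recipe. It assumes a recent scikit-learn (where GridSearchCV is imported from sklearn.model_selection) and a synthetic dataset from make_classification. The grid values are illustrative guesses, and n_estimators=50 only stands in for "as high as you can afford".

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic data, used only for illustration (an assumption, not from the thread).
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=5, random_state=0
)

# n_estimators is fixed up front and deliberately left out of the grid.
forest = RandomForestClassifier(n_estimators=50, random_state=0)

# A handful of candidates per parameter rather than the full range.
param_grid = {
    "min_samples_leaf": [1, 5, 20],
    "max_features": ["sqrt", 0.5, None],
}

search = GridSearchCV(forest, param_grid, cv=3)
search.fit(X, y)

print(search.best_params_)
print(round(search.best_score_, 3))
```

Swapping RandomForestClassifier for ExtraTreesClassifier should work with the same grid, since both accept these parameters.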
Cheers,
Andy
On 09/26/2014 10:24 PM,
hi folks,
what are some useful ranges of parameters to throw into a grid search? and
are there specific differences between random forests and extra trees? i
understand one could try different impurity measures for classification,
but any suggestions on sensitivity of other parameters would be nice.
On 09/23/2014 11:50 PM, Pagliari, Roberto wrote:
I’m a bit confused as to why GridSearchCV is not needed with random
forests. I understand that with RF, each tree will only get to see a
partial representation of the data.
Why do you say GridSearchCV is not needed?
I think it should always b
You can indeed tune parameters of the RF with grid search. By default the
estimator's own score method is used to rank candidates, although you can
specify a different metric through GridSearchCV's scoring parameter.
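
As a rough illustration of the two options, the sketch below runs the same search twice: once with the default score method (accuracy for classifiers) and once with scoring="roc_auc". The dataset, grid values, and variable names are assumptions made for the example, and the import path assumes a recent scikit-learn.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic binary classification data, for illustration only.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

forest = RandomForestClassifier(n_estimators=50, random_state=0)
param_grid = {"max_depth": [2, 5, None]}

# No scoring argument: candidates are ranked with forest.score (accuracy).
default_search = GridSearchCV(forest, param_grid, cv=3).fit(X, y)

# Explicit scoring: candidates are ranked by area under the ROC curve.
auc_search = GridSearchCV(forest, param_grid, cv=3, scoring="roc_auc").fit(X, y)

print(default_search.best_params_, default_search.best_score_)
print(auc_search.best_params_, auc_search.best_score_)
```

The two searches may or may not pick the same max_depth; the point is only that the ranking criterion is whatever scoring specifies.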
On 24 September 2014 07:50, Pagliari, Roberto
wrote:
> I’m a bit confused as to why gridsearchCV is not needed with
I'm a bit confused as to why GridSearchCV is not needed with random forests. I
understand that with RF, each tree will only get to see a partial
representation of the data.
However, if I wanted to tune some parameters of the RF, wouldn't I still need
to do a grid search? If that is the case, does