[scikit-learn] Why the default max_samples of Random Forest is X.shape[0]?

Fernando Marcos Wittmann Fri, 08 May 2020 14:07:25 -0700

When reading the documentation of Random Forest, I got the following:
```
max_samples : int or float, default=None
    If bootstrap is True, the number of samples to draw from X to train
    each base estimator.

    - *If None (default), then draw `X.shape[0]` samples.*
    - If int, then draw `max_samples` samples.
    - If float, then draw `max_samples * X.shape[0]` samples. Thus,
      `max_samples` should be in the interval `(0, 1)`.
```


Why is the whole dataset (i.e. X.shape[0] samples from X) used to
build each tree? That would be equivalent to setting bootstrap to False,
right? Wouldn't it be better practice to use 2/3 of the size of the
dataset as the default?
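
One way to probe the "equivalent to bootstrap=False" assumption is a small simulation. This is only a sketch using the standard library, not scikit-learn's own sampling code, and it assumes that a bootstrap sample is drawn with replacement. `unique_fraction` is a hypothetical helper name introduced here for illustration.

```python
import random


def unique_fraction(n_samples, seed=0):
    """Fraction of distinct rows in one bootstrap draw of size n_samples."""
    rng = random.Random(seed)
    # Draw n_samples row indices with replacement (assumed bootstrap behaviour).
    drawn = [rng.randrange(n_samples) for _ in range(n_samples)]
    # Duplicates collapse in the set, so this counts the distinct rows seen.
    return len(set(drawn)) / n_samples


frac = unique_fraction(10_000)
print(f"distinct rows in one bootstrap draw: {frac:.3f}")
```

Under that assumption, the fraction of distinct rows should come out close to 1 - 1/e (about 0.632) for large datasets, so drawing X.shape[0] samples with replacement would not be the same as training each tree on the full dataset, and it would already land near the 2/3 figure mentioned above.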

_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn
