Ohh, I can see my mistake now after reviewing the concepts of bootstrapping
and sampling with replacement. I was assuming that the "replacement"
happened only after finishing each tree (i.e. if I was sampling 2/3 of the
data, the very same data could be selected again for another tree, but no
element would be repeated within a given tree). My apologies. Everything
makes sense again.
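
In case it helps anyone else who trips over the same thing, here is a
minimal sketch of a single bootstrap draw using plain NumPy. It is only an
illustration of sampling with replacement, not scikit-learn's internal
code, and the names (n_samples, indices, unique_fraction) are made up for
this example. Drawing as many indices as there are rows still leaves a
good share of the rows out of each tree, because some rows get picked more
than once.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 1000  # hypothetical dataset size

# One bootstrap draw: n_samples row indices, drawn WITH replacement,
# so the same row can appear several times for a single tree.
indices = rng.integers(0, n_samples, size=n_samples)

# Count how many distinct rows actually ended up in the draw.
n_unique = np.unique(indices).size
unique_fraction = n_unique / n_samples

print(f"drawn: {indices.size}, distinct rows: {n_unique} ({unique_fraction:.1%})")
```

For large n the expected share of distinct rows is 1 - (1 - 1/n)^n, which
tends to 1 - 1/e, roughly 63%. So each tree sees about two thirds of the
distinct rows even though the draw has the same size as the dataset.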

On Sun, May 10, 2020, 19:42 Fernando Marcos Wittmann <
fernando.wittm...@gmail.com> wrote:

> Okay, so it's sampling with replacement, with the same size as the original
> dataset. That means that some of the samples would be repeated within each tree
>
> On Sun, May 10, 2020, 19:40 Fernando Marcos Wittmann <
> fernando.wittm...@gmail.com> wrote:
>
>> My question is why the full dataset is being used by default when
>> building each tree. That's not a random forest. The main point of RF is to
>> build each tree with a subsample of the full dataset
>>
>> On Sun, May 10, 2020, 09:50 Joel Nothman <joel.noth...@gmail.com> wrote:
>>
>>> A bootstrap is very commonly a random draw with replacement of equal
>>> size to the original sample.
>>> _______________________________________________
>>> scikit-learn mailing list
>>> scikit-learn@python.org
>>> https://mail.python.org/mailman/listinfo/scikit-learn
>>>
>>
