Oh, I can see my mistake now, after reviewing the concept of bootstrapping and sampling with replacement. I was assuming that the "replacement" happened only after finishing each tree (i.e. if I was sampling 2/3 of the data, the very same data could be selected again for each tree, but no element would be repeated within a given tree). My apologies. Everything makes sense again.
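
To make the distinction concrete, here is a minimal NumPy sketch (not scikit-learn's internal code) comparing the two schemes. The names `n_samples`, `bootstrap_idx` and `subsample_idx` are only illustrative, and the seed and sample size are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
n_samples = 10_000              # hypothetical dataset size

# Bootstrap: draw n_samples row indices WITH replacement, so some rows
# appear several times in one tree's sample and others not at all.
bootstrap_idx = rng.integers(0, n_samples, size=n_samples)
unique_fraction = np.unique(bootstrap_idx).size / n_samples

# What I had wrongly assumed: a 2/3 subsample WITHOUT replacement,
# where no row can be repeated within a single tree.
subsample_idx = rng.choice(n_samples, size=2 * n_samples // 3, replace=False)

print(f"bootstrap sample size:     {bootstrap_idx.size}")
print(f"distinct rows in bootstrap: {unique_fraction:.3f} of the dataset")
print(f"distinct rows in subsample: {np.unique(subsample_idx).size / n_samples:.3f} of the dataset")
```

The fraction of distinct rows in a bootstrap sample of this kind tends towards 1 - 1/e, roughly 0.632, so each tree still sees only about two thirds of the distinct rows even though the sample has the same size as the dataset. If a smaller draw per tree is wanted, the forest estimators in scikit-learn accept a `max_samples` parameter (used when `bootstrap=True`).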

On Sun, May 10, 2020, 19:42 Fernando Marcos Wittmann <fernando.wittm...@gmail.com> wrote:

> Okay, so it's sampling with replacement with the same size as the original
> dataset. That means that some of the samples would be repeated for each tree
>
> On Sun, May 10, 2020, 19:40 Fernando Marcos Wittmann <fernando.wittm...@gmail.com> wrote:
>
>> My question is why the full dataset is being used as default when
>> building each tree. That's not random forest. The main point of RF is to
>> build each tree with a subsample of the full dataset
>>
>> On Sun, May 10, 2020, 09:50 Joel Nothman <joel.noth...@gmail.com> wrote:
>>
>>> A bootstrap is very commonly a random draw with replacement of equal
>>> size to the original sample.
>>> _______________________________________________
>>> scikit-learn mailing list
>>> scikit-learn@python.org
>>> https://mail.python.org/mailman/listinfo/scikit-learn