The function rpart() may well overfit if the complexity parameter cp
is left at its default.  Use the functions printcp() and plotcp() to
check how the cross-validation estimate of relative error (xerror)
changes with the number of splits (NB: the cp value that leads to a
further split decreases monotonically as the number of splits grows).
The 'rel error' column from printcp() is computed on the training data
and can be hopelessly optimistic.
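
A minimal sketch of that check. It assumes the kyphosis data that ships
with rpart (a stand-in, not the data from this thread) and an arbitrarily
small cp so that a large tree is grown before pruning; picking the cp with
the smallest xerror is one common rule, not the only one.

```r
library(rpart)

set.seed(42)  # the cross-validation folds are random, so fix the seed

# Assumption: kyphosis is only a stand-in data set; cp = 0.001 is an
# arbitrary small value chosen to grow a deliberately large tree.
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
             method = "class", cp = 0.001)

printcp(fit)  # columns: CP, nsplit, rel error, xerror, xstd
plotcp(fit)   # cross-validated error against cp / tree size

# Choose the cp with the smallest cross-validated error and prune to it.
cptab   <- fit$cptable
best_cp <- cptab[which.min(cptab[, "xerror"]), "CP"]
pruned  <- prune(fit, cp = best_cp)
```

Comparing the 'rel error' and 'xerror' columns in the printed table shows
how far the training-data figure can drift from the cross-validated one.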


John Maindonald             email: john.maindon...@anu.edu.au

phone : +61 2 (6125)3473    fax  : +61 2(6125)5549

Centre for Mathematics & Its Applications, Room 1194,

John Dedman Mathematical Sciences Building (Building 27)

Australian National University, Canberra ACT 0200.


On 8/04/2014, at 8:00 pm, r-help-requ...@r-project.org wrote:

From: r-help-boun...@r-project.org On Behalf Of Schillo, Sonja
Sent: Thursday, April 03, 2014 3:58 PM
To: Mitchell Maltenfort
Cc: r-help@r-project.org
Subject: Re: [R] rpart and randomforest results

Hi,

The random forest should do that, you're totally right. As far as I know, it 
does so by randomly selecting the variables considered for each split. Here, 
however, we set the number of variables to consider at each split to the 
number of variables available, so I assumed the random forest has no chance 
to select variables at random. The next thing randomForest does is 
bootstrapping. Here again, we set the sample size to the number of cases in 
the data set, so that no bootstrapping should be done.
We tried to take all the "randomness" away from randomForest.
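
A sketch of what such a setup might look like with the randomForest
package. It assumes the iris data as a stand-in, since the thread does not
show the actual data or call. Note that with replace left at its default of
TRUE, sampsize equal to the number of cases is still an ordinary bootstrap
sample; sampling without replacement has to be requested explicitly.

```r
library(randomForest)

set.seed(1)
dat <- iris            # assumption: stand-in data, not the thread's data
p   <- ncol(dat) - 1   # number of predictor variables
n   <- nrow(dat)       # number of cases

rf <- randomForest(Species ~ ., data = dat,
                   ntree    = 1,      # a single tree, for comparison with rpart
                   mtry     = p,      # consider every variable at each split
                   sampsize = n,      # draw as many cases as the data set holds
                   replace  = FALSE)  # without replacement: every case used once
```

Even then the tree need not match one from rpart(): randomForest grows
each tree to full size without pruning, whereas rpart() stops according to
its own cp and minsplit settings.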

Is that plausible and does anyone have another idea?

Thanks
Sonja


