
[R] Re : Running random forest using different training and testing schemes

2009-04-12 Thread Pierre Moffard

Hi Chrysanthi,

Check out the randomForest package and its randomForest function. The package
has a cross-validation function, rfcv, and randomForest itself reports an
out-of-bag error estimate. Sorry for not providing a lengthier response at the
moment, but I'm rather busy on a project. Let me know if you need more help.
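
As a rough illustration of the suggestion above, the sketch below fits a classification forest and reads off its out-of-bag (OOB) error. It assumes the randomForest package is installed, and it uses the built-in iris data set as a stand-in for the poster's data, which is not shown in the thread.

```r
# Assumes the randomForest package is installed:
#   install.packages("randomForest")
library(randomForest)

set.seed(42)   # make the bootstrap samples reproducible
data(iris)     # built-in data set, used here only as a stand-in

# Species is a factor, so randomForest fits a classification forest
fit <- randomForest(Species ~ ., data = iris, ntree = 500)

# OOB error after the last tree: an internal estimate of the test error
oob_error <- fit$err.rate[fit$ntree, "OOB"]

print(fit$confusion)   # confusion matrix based on OOB predictions
print(oob_error)
```

The OOB estimate comes for free with every fit, so it can serve as a quick first check before setting up an explicit train/test split or cross-validation.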

Also, to split your data into two parts, the training set and the test set, you
can do the following (n is the number of data points):
n <- nrow(data)
indices <- sample(rep(c(TRUE, FALSE), length.out = n))
training_indices <- (1:n)[indices]
test_indices <- (1:n)[!indices]

Then data[training_indices, ] is the training set and data[test_indices, ] is
the test set.

Best,
Pierre



From: Chrysanthi A. chrys...@gmail.com
To: r-help@r-project.org
Sent: Sunday, 12 April 2009, 17:26:59
Subject: [R] Running random forest using different training and testing schemes

Hi,

I would like to run the random forest classification algorithm and check the
accuracy of the prediction under different training and testing schemes, for
example extracting 70% of the samples for training and the rest for testing,
or using a 10-fold cross-validation scheme.
How can I do that? Is there a function?

Thanks a lot,

Chrysanthi.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



  

[R] Re : Running random forest using different training and testing schemes

2009-04-12 Thread Pierre Moffard



You need to include something like this in your code:

tree <- rpart(result ~ ., data = data, control = rpart.control(xval = 10))

Here xval = 10 requests 10-fold CV.
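
Note that rpart fits a single tree, and xval controls the cross-validation it runs internally for its complexity table. For a random forest, one possible approach is to assign the folds by hand and fit one forest per fold. The sketch below is only that, a sketch: it assumes the randomForest package is installed and uses the built-in iris data set as a stand-in; the names fold, fold_accuracy and cv_accuracy are made up for this example.

```r
# Assumes the randomForest package is installed
library(randomForest)

set.seed(1)
data(iris)   # stand-in data set
n <- nrow(iris)
k <- 10      # number of folds

# Randomly assign each row to one of k folds of (nearly) equal size
fold <- sample(rep(1:k, length.out = n))

fold_accuracy <- numeric(k)
for (i in 1:k) {
  train <- iris[fold != i, ]   # k - 1 folds for training
  test  <- iris[fold == i, ]   # held-out fold for testing
  fit   <- randomForest(Species ~ ., data = train)
  pred  <- predict(fit, newdata = test)
  fold_accuracy[i] <- mean(pred == test$Species)
}

# Average accuracy over the k held-out folds
cv_accuracy <- mean(fold_accuracy)
print(cv_accuracy)
```

Each row is used for testing exactly once, so cv_accuracy is an estimate of out-of-sample accuracy that does not depend on a single split.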

Best,
Pierre




Re: [R] Re : Running random forest using different training and testing schemes

2009-04-12 Thread Chrysanthi A.

Hi Pierre,

Thanks a lot for your help.
So, using that script, I just separate my data into two parts, right? To use
70% of the data as the training set and the rest as the test set, should I
multiply n by 0.70 (in this case)?

Many thanks,

Chrysanthi
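
One way to get a 70/30 split is indeed to multiply n by 0.70 and draw that many row numbers at random. The base R sketch below shows the idea; the built-in iris data set stands in for the poster's data, and train_fraction, n_train, training_set and test_set are names chosen for this example.

```r
set.seed(123)   # reproducible split
data(iris)      # stand-in for the poster's `data`
n <- nrow(iris)
train_fraction <- 0.70   # share of rows used for training

n_train <- round(train_fraction * n)       # number of training rows
training_indices <- sample(n, n_train)     # row numbers, drawn without replacement
test_indices <- setdiff(1:n, training_indices)

training_set <- iris[training_indices, ]
test_set     <- iris[test_indices, ]

# 150 rows in total: 105 for training, 45 for testing
print(c(train = nrow(training_set), test = nrow(test_set)))
```

From here, a forest could be fitted on training_set and its predictions compared with the true labels in test_set to estimate accuracy.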



