Hi Steve,

Thanks for the note. I did try the example and the result didn't make sense to me. For splitting a vector, what you describe is a big difference between them. For splitting a data frame, I now wonder if these two functions are the wrong choices. They seem to split the columns, at least in the few things I tried.

Bonnie

Quoting Steve Lianoglou <mailinglist.honey...@gmail.com>:

Hi,

On Sun, Oct 2, 2011 at 2:47 PM,  <bby2...@columbia.edu> wrote:
Hello,

I'm trying to separate my dataset into 4 parts with the 4th one as the test
dataset, and the other three to fit a model.

I've been searching for the difference between these 2 functions in the caret
package, but the most I can get is this--

A series of test/training partitions are created using createDataPartition
while createResample creates one or more bootstrap samples. createFolds
splits the data into k groups.

Am I missing something here? What is the difference between createDataPartition
and createFolds? I guess they wouldn't be equivalent.

Well -- you could always look at the source code to find out (enter
the name of the function into your R console and hit return), but you
can also do some experimentation to find out. Using the data from the
Examples section of caret::createFolds:

R> library(caret)
R> data(oil)
R> part <- createDataPartition(oilType, 2)
R> fold <- createFolds(oilType, 2)

R> length(Reduce(intersect, part))
[1] 27

R> length(Reduce(intersect, fold))
[1] 0

Looks like `createDataPartition` splits your data into smaller pieces,
but allows the same example to appear in different splits.
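
To make the overlap easier to see, here is a minimal sketch with the arguments named. It assumes the second positional argument of `createDataPartition` is `times` and that `p` defaults to 0.5, so each split is a separate random draw of about half the rows. The seed and the variable names `in_train`, `train_y` and `test_y` are only for illustration.

```r
library(caret)
data(oil)  # provides oilType (the outcome factor) and fattyAcids (a data frame)

set.seed(1)  # arbitrary seed, only so the draws are repeatable

# Two independent draws, each holding roughly half of the rows
part <- createDataPartition(oilType, times = 2, p = 0.5)
lengths(part)                              # size of each index set
length(intersect(part[[1]], part[[2]]))    # rows that landed in both draws

# One 75/25 split: the returned indices are the training rows
in_train <- createDataPartition(oilType, p = 0.75, list = FALSE)[, 1]
train_y  <- oilType[in_train]
test_y   <- oilType[-in_train]
```

Since the draws are independent, nothing stops a row from being picked twice, which is probably why the intersection above is not empty.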

`createFolds` doesn't allow the same example to appear in different
folds.
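
For your 4-part case, something along these lines might work. It is only a sketch: it assumes you pass the outcome vector (not the whole data frame) to `createFolds`, and then use the returned indices to pick rows of the data frame. The `oil` data stands in for your own dataset, and names like `test_idx` and `train_x` are made up for the example.

```r
library(caret)
data(oil)  # oilType is the outcome factor, fattyAcids the predictor data frame

set.seed(1)  # arbitrary seed, only so the folds are repeatable

# Four disjoint sets of row indices, built from the outcome vector
folds <- createFolds(oilType, k = 4)

# Hold out the 4th fold for testing, fit on the other three
test_idx  <- folds[[4]]
train_idx <- unlist(folds[1:3], use.names = FALSE)

# Index the ROWS of the data frame; note the trailing comma
test_x  <- fattyAcids[test_idx, ]
train_x <- fattyAcids[train_idx, ]
test_y  <- oilType[test_idx]
train_y <- oilType[train_idx]
```

Both functions hand back row indices rather than pieces of the data itself, so the subsetting of the data frame is a separate step.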

HTH,
-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
