Hi,

On Sat, 14 Aug 2004, Peyuco Porras Porras wrote:
> Hi;
> Does anyone know how to create a calibration and validation set from a particular
> dataset? I have a dataframe with nearly 20,000 rows! and I would like to select
> (randomly) a subset from the original dataset (...I found how to do that) to use as
> calibration set. However, I don't know how to remove this "calibration" set from the
> original dataframe in order to get my "validation" set.....Any hint will be greatly
> appreciated.

A really quick way, supposing you want 30% of your dataset as the validation set:

> iris.id = sample(nrow(iris), nrow(iris) * 0.3)
> iris.valid = iris[iris.id, ]
> iris.train = iris[-iris.id, ]
> nrow(iris.valid)
[1] 45
> nrow(iris.train)
[1] 105

The first line draws a random sample of row numbers, 30% of the number of rows in
the iris data. The second line subsets the data to those rows: the validation set.
The third uses negative indices to drop those same rows and keep what's left: the
training set.

This is perhaps not efficient and the code can definitely be simplified...but it's
Sunday morning and I haven't had my morning coffee yet :D

Cheers,

Kevin

--------------------------------
Ko-Kang Kevin Wang
PhD Student
Centre for Mathematics and its Applications
Building 27, Room 1004
Mathematical Sciences Institute (MSI)
Australian National University
Canberra, ACT 0200
Australia

Homepage: http://wwwmaths.anu.edu.au/~wangk/
Ph (W): +61-2-6125-2431
Ph (H): +61-2-6125-7407
Ph (M): +61-40-451-8301

______________________________________________
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
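A follow-up sketch on the split described in the message above, in case a reusable version helps. The function name split_rows and its arguments valid_frac and seed are made up for illustration and are not from the original post. It does the same thing (sample row numbers, then take those rows and the remaining rows) with two small changes: an optional seed so the split can be repeated, and setdiff() in place of negative indexing, since iris[-integer(0), ] returns zero rows rather than all of them when the sample happens to be empty.

```r
# Hypothetical helper (not from the original post): split a data frame
# into a validation set and a training set by row number.
split_rows <- function(df, valid_frac = 0.3, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)         # optional: make the draw repeatable
  n_valid <- floor(nrow(df) * valid_frac)    # round down to a whole number of rows
  valid_id <- sample(nrow(df), n_valid)      # row numbers, drawn without replacement
  # Keep every row number that was not drawn. Unlike df[-valid_id, ],
  # this still returns all rows when valid_id is empty.
  train_id <- setdiff(seq_len(nrow(df)), valid_id)
  list(valid = df[valid_id, , drop = FALSE],
       train = df[train_id, , drop = FALSE])
}

parts <- split_rows(iris, valid_frac = 0.3, seed = 1)
nrow(parts$valid)
nrow(parts$train)
```

With the 150-row iris data this gives the same 45/105 split sizes as above; which rows land in each set depends on the seed.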
