Re: [R] calibration/validation sets

Kevin Wang Sat, 14 Aug 2004 17:53:17 -0700

Hi,

On Sat, 14 Aug 2004, Peyuco Porras Porras . wrote:


> Hi;
> Does anyone know how to create a calibration and validation set from a particular 
> dataset? I have a dataframe with nearly 20,000 rows! and I would like to select 
> (randomly) a subset from the original dataset (...I found how to do that) to use as 
> calibration set. However, I don't know how to remove this "calibration" set from the 
> original dataframe in order to get my "validation" set.....Any hint will be greatly 
> appreciated.

Here is a really quick way, supposing you want 30% of your dataset as the
validation set:
> iris.id = sample(nrow(iris), nrow(iris) * 0.3)
> iris.valid = iris[iris.id, ]
> iris.train = iris[-iris.id, ]
> nrow(iris.valid)
[1] 45
> nrow(iris.train)
[1] 105

The first line draws a random sample of row indices, 30% of the number of
rows in the iris data.  The second line subsets those rows -- the validation
set.  The third takes what's left -- the training set.  This is perhaps
not efficient and the code can definitely be simplified...but it's Sunday
morning and I haven't had my morning coffee yet :D
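
For reuse, the same idea could be wrapped up roughly as below.  This is only
a sketch: split_rows, valid_frac and seed are hypothetical names that do not
appear in the code above.  It assumes a reproducible split is wanted (hence
set.seed), and it uses a logical mask instead of negative indexing, because
data[-integer(0), ] returns zero rows rather than all rows if the sample
happens to be empty.

```r
# Hypothetical helper: split a data frame into validation and training sets.
split_rows <- function(data, valid_frac = 0.3, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)            # make the split reproducible
  n <- nrow(data)
  valid_id <- sample(n, floor(n * valid_frac))  # random row indices
  is_valid <- seq_len(n) %in% valid_id          # TRUE for validation rows
  list(valid = data[is_valid, , drop = FALSE],
       train = data[!is_valid, , drop = FALSE])
}

parts <- split_rows(iris, valid_frac = 0.3, seed = 1)
nrow(parts$valid)
nrow(parts$train)
```

Every row lands in exactly one of the two sets, so the row counts always add
up to nrow(data).  Note that the mask keeps rows in their original order
rather than in sampled order.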

Cheers,

Kevin


--------------------------------
Ko-Kang Kevin Wang
PhD Student
Centre for Mathematics and its Applications
Building 27, Room 1004
Mathematical Sciences Institute (MSI)
Australian National University
Canberra, ACT 0200
Australia
Homepage: http://wwwmaths.anu.edu.au/~wangk/
Ph (W): +61-2-6125-2431
Ph (H): +61-2-6125-7407
Ph (M): +61-40-451-8301

______________________________________________
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
