On 07/03/2011 2:17 PM, Cesar Hincapié wrote:
Hello:

I wonder if I could get a little help with random sampling in R.

I have a vector of length 7375.  I would like to draw 3 distinct random 
samples, each of length 100 without replacement.  I have tried the following:

d1<- 1:7375

set.seed(7)
i<- sample(d1, 100, replace=F)
s1<- sort(d1[i])
s1

d2<- d1[-i]
set.seed(77)
j<- sample(d2, 100, replace=F)
s2<- sort(d2[j])
s2

d3<- d2[-j]
set.seed(777)
k<- sample(d3, 100, replace=F)
s3<- sort(d3[k])
s3

D<- data.frame(a=s1,b=s2,c=s3)


However, s2 is only 97 elements long, and s3, only 96 long.

I would appreciate any suggestions on a better approach.
I'm also curious to know why my second and third samples are less than 100 
elements in length.

If you want 3 non-overlapping, non-repeating samples of 100, why not draw one sample of 300, and take 3 subsets of it?
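
A minimal sketch of that idea, assuming the data are in d1 as in your code. The seed value and the name idx are arbitrary choices for illustration:

```r
d1 <- 1:7375

set.seed(7)                      # arbitrary seed, only for reproducibility
idx <- sample(length(d1), 300)   # 300 distinct positions in d1, no replacement

# Split the 300 positions into three disjoint groups of 100
s1 <- sort(d1[idx[1:100]])
s2 <- sort(d1[idx[101:200]])
s3 <- sort(d1[idx[201:300]])

D <- data.frame(a = s1, b = s2, c = s3)
```

Because all 300 positions come from a single draw without replacement, the three samples cannot overlap, and only one seed is needed.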

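Your second and third samples came out short because j and k hold values sampled from d2 and d3, but you then used them as indices into those vectors. d2 has only 7275 elements and d3 fewer still, so any sampled value larger than the vector's length indexes past the end and returns NA, and sort() drops NAs by default. (It worked for s1 only because d1 is 1:7375, so its values and its indices coincide.) For example,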

d2 <- 1:10
d2[10:12]         # 10 NA NA: positions 11 and 12 do not exist
sort(d2[10:12])   # 10: sort() drops the NAs by default

See ?sort (the na.last argument) for an explanation of how to keep NA values when you sort.
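
If you would rather keep the step-by-step approach, one possible fix is to sample positions with seq_along() instead of sampling the values themselves. This is only a sketch that reuses the names and seeds from your code:

```r
d1 <- 1:7375

set.seed(7)
i  <- sample(seq_along(d1), 100)   # positions in d1
s1 <- sort(d1[i])

d2 <- d1[-i]                       # remove what was already drawn
set.seed(77)
j  <- sample(seq_along(d2), 100)   # positions in d2, not values of d2
s2 <- sort(d2[j])

d3 <- d2[-j]
set.seed(777)
k  <- sample(seq_along(d3), 100)   # positions in d3
s3 <- sort(d3[k])

c(length(s1), length(s2), length(s3))   # 100 100 100
```

Since j and k are now always valid positions, no NAs are produced and each sample keeps all 100 elements.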

Duncan Murdoch

Thanks for your time and consideration,

Cesar A. Hincapié, DC, MHSc

Research Fellow, Division of Health Care and Outcomes Research, Toronto Western 
Research Institute
PhD Candidate in Epidemiology, Dalla Lana School of Public Health, University 
of Toronto
e. cesar.hinca...@utoronto.ca






______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
