Aldi,

Your concept of sample is different from mine. I would expect sampling
with replacement to be equivalent to a for loop that draws a single
value on each pass:

  samples <- 1:400
  for (i in 1:400)
    samples[i] <- sample(c(0,1,2), 1, prob=c(0.02, 0.93, 0.05))

Sampling without replacement proceeds like this:

  first:  sample(c(0,1,2), 1, prob=c(0.02, 0.93, 0.05))
  second: depending on the first draw (suppose 2 was selected)
          sample(c(0,1), 1, prob=c(0.02, 0.93)/.95)
  third:  whatever is remaining, with probability 1.
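The three steps above can be sketched as a small function. This is only an illustration: draw_sequentially is a made-up helper name, not part of R, and the sketch makes no claim to reproduce the random number stream of sample() itself. It draws one element at a time, removes it, and lets sample.int() rescale the weights that remain.

```r
# Hypothetical helper (not part of R): weighted sampling without
# replacement, written out as one draw per step.
draw_sequentially <- function(x, prob) {
  out <- x[0]                          # empty vector of the same type as x
  while (length(x) > 0) {
    # Draw an index among the remaining items; the remaining weights
    # are rescaled internally, so they need not sum to 1.
    i <- sample.int(length(x), 1, prob = prob)
    out  <- c(out, x[i])
    x    <- x[-i]                      # remove the chosen value ...
    prob <- prob[-i]                   # ... and its weight
  }
  out
}

set.seed(1)                            # arbitrary seed, for reproducibility only
values <- c(0, 1, 2)
prob   <- c(0.02, 0.93, 0.05)

# One run returns all three values in a random order.
one_run <- draw_sequentially(values, prob)
print(one_run)

# The first draw alone follows the prob vector.
first <- replicate(2000, draw_sequentially(values, prob)[1])
print(table(first) / 2000)
```

Because a vector of three values is used up after three draws, asking for 400 draws without replacement cannot succeed, which is what the error message in R reports.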
n.b. the second is equivalent to sample(c(0,1), 1, prob=c(0.02, 0.93)),
since sample normalizes the probabilities itself.

Concerning your result:

  observed <- c(0.0200, 0.9225, 0.0575)*400
  expected <- c(0.02, 0.93, 0.05)*400
  stat <- sum((observed-expected)^2/expected)
  pchisq(stat, 2, lower.tail=FALSE)
  [1] 0.788915

A p-value of 0.79 means the observed proportions are well within
ordinary sampling variation. Seems ok to me.

Cheers,
Kees

On Saturday 30 December 2006 16:55, Aldi Kraja wrote:
> Partial Summary and discussion:
> =====================
> Thank you to Chao Gai, Chuck Cleland, and Jim Lemon for their suggestion
> to use replace=T in R.
> There is a problem though (see below)
>
> In Splus7, sample is defined as
> -------------
> sample(x, size = n, replace = F, prob = NULL, n = NULL, ...) where
> replace=F
> In Splus7
> > xlrmN1 <- sample(c(0,1,2),400 ,prob=c(0.02 ,0.93 ,0.05 ))
> and the
> > table(xlrmN1)/400
>     0    1    2
>  0.02 0.93 0.05
> show that "sample" is working exactly as expected based on the prob vector.
>
> When "sample" is used in Splus7 with replacement we see the following
> result:
> > xlrmN1 <- sample(c(0,1,2),400 ,replace=T,prob=c(0.02 ,0.93 ,0.05 ))
> > table(xlrmN1)/400
>       0     1      2
>  0.0125 0.925 0.0625
> which I think is working again as expected.
>
> In R, sample is defined as
> ---------
> sample(x, size, replace = FALSE, prob = NULL)
>
> So the above statement with replace=F did not work (reported error)
> but with replace=T produced,
>
> > table(xlrmN1)/400
> xlrmN1
>      0      1      2
> 0.0200 0.9225 0.0575
>
> which is not exactly the sample with the probabilities provided
> (0.02,0.93,0.05)
>
> Now let's return to the concept of replace=F and replace=T.
> When I ask "sample" to select a sample of 400 from a vector of 3 with NO
> replacement, I would think the following a). create a very large sample
> from 0, 1, and 2. b). From this large sample, based on the prob vector
> select without replacement. c). As a result I expect the probability of
> the selected sample to be exactly the same as the prob vector (as in
> Splus7)
>
> When I ask "sample" to select a sample of 400 from a vector of 3 with
> replacement, I would think the following a). create a very large sample
> from 0, 1, and 2. b). From this large sample, based on the prob vector
> select with replacement, which means some of the previously selected 0, 1,
> 2 can be selected again. c). As a result I expect the probability of the
> selected sample to be NOT exactly the same as the prob vector (as in
> Splus7 and R).
>
> So there are two conclusions: "sample" in R is not working correctly, OR I
> am missing some precision as a rounding error to produce
> prob=c(0.02 ,0.93 ,0.05 ).
> Am I misunderstanding the "sample" function in R?
>
> Any suggestions are appreciated.
> TIA,
>
> Aldi
>
> Aldi Kraja wrote:
> >Hi,
> >In Splus7 this statement
> >xlrmN1 <- sample(c(0,1,2),400 ,prob=c(0.02 ,0.93 ,0.05 ))
> >worked fine, but in R the interpreter reports that the length of the
> >vector to choose from, c(0,1,2), is shorter than the number of times I
> >want to select from the vector c(0,1,2).
> >Any good reason?
> >See below the error.
> >
> > > xlrmN1 <- sample(c(0,1,2),400 ,prob=c(0.02 ,0.93 ,0.05 ))
> >
> >Error in sample(length(x), size, replace, prob) :
> >        cannot take a sample larger than the population
> >        when 'replace = FALSE'
> >Execution halted
> >
> >TIA,
> >
> >Aldi
> >
> >--
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html and provide commented, minimal,
> self-contained, reproducible code.
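As a postscript on the two expectations discussed in the thread, here is a minimal sketch contrasting them. Part (a) repeats the replace=TRUE call and tests it against the requested probabilities with chisq.test(), which gives the same p-value as the manual pchisq() calculation. Part (b) is one possible way to get exact proportions, assuming that is what is wanted: build a population with the exact counts and shuffle it. It is not a statement about how S-PLUS implements sample. The seed and the variable names are arbitrary choices for the example.

```r
set.seed(2006)                         # arbitrary seed, for reproducibility only
values <- c(0, 1, 2)
prob   <- c(0.02, 0.93, 0.05)
n      <- 400

# (a) Independent draws: the proportions fluctuate around prob.
with_repl <- sample(values, n, replace = TRUE, prob = prob)
counts <- table(factor(with_repl, levels = values))
print(counts / n)
print(chisq.test(counts, p = prob)$p.value)

# The proportions reported in the thread, tested the same way.
reported   <- c(0.0200, 0.9225, 0.0575) * n
p_reported <- chisq.test(reported, p = prob)$p.value
print(p_reported)                      # about 0.789, as computed with pchisq()

# (b) Exact proportions: a fixed population of 8, 372 and 20 values,
# returned in random order. round() guards against floating point
# results such as 371.9999 being truncated by rep().
population <- rep(values, times = round(n * prob))
exact <- sample(population)            # a permutation, so no replacement
print(table(exact) / n)
```

With (b) the table is 0.02, 0.93, 0.05 on every run, whereas (a) only matches the prob vector on average.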