Thanks Thomas!

I am trying to draw a random sample from a household survey that has 80,000 observations. rural is the name of the dataset, and iwt is the survey weight assigned to each observation. The resulting errors are:

> z=sample(rural,5000,replace=TRUE, Prob=rural$iwt)
Error in sample(rural, 5000, replace = TRUE, Prob = rural$iwt) :
  unused argument(s) (Prob = c(133, 133, 166, 166, 166, 166, 1047, 1047,
1047, 1047, 288, 623, 623, 240, 240, 432, 144, 144, 719, 719, 316, 342,
342, 816, 816, 105, 158, 158, 1105, 1105, 101, 557, 557, 405, 405, 101,
304, 304, 1165, 1165, 193, 771, 771, 1060, 1060, 482, 530, 530, 2024,
2024, 254, 254, 241, 241, 241, 241, 674, 674, 674, 674, 137, 137, 623,
623, 623, 623, 603, 603, 603, 603, 285, 556, 556, 970, 970, 285, 728,
728, 499, 499, 272, 1349, 1349, 218, 218, 272, 1240, 1240, 95, 95, 307,
307, 307, 307, 307,
> iwt=rural[,"iwt"]
> z=sample(rural,5000,replace=TRUE, Prob=iwt)
Error in sample(rural, 5000, replace = TRUE, Prob = iwt) :
  unused argument(s) (Prob = c(133, 133, 166, 166, 166, 166, 1047, 1047,
1047, 1047, 288, 623, 623, 240, 240, 432, 144, 144, 719, 719, 316, 342,
342, 816, 816, 105, 158, 158, 1105, 1105, 101, 557, 557, 405, 405, 101,
304, 304, 1165, 1165, 193, 771, 771, 1060, 1060, 482, 530, 530, 2024,
2024, 254, 254, 241, 241, 241, 241, 674, 674, 674, 674, 137, 137, 623,
623, 623, 623, 603, 603, 603, 603, 285, 556, 556, 970, 970, 285, 728,
728, 499, 499, 272, 1349, 1349, 218, 218, 272, 1240, 1240, 95, 95, 307,
307, 307, 307, 307,
> iwt=as.vector(rural[,"iwt"])
> z=sample(rural,5000,replace=TRUE, Prob=iwt)
Error in sample(rural, 5000, replace = TRUE, Prob = iwt) :
  unused argument(s) (Prob = c(133, 133, 166, 166, 166, 166, 1047, 1047,
1047, 1047, 288, 623, 623, 240, 240, 432, 144, 144, 719, 719, 316, 342,
342, 816, 816, 105, 158, 158, 1105, 1105, 101, 557, 557, 405, 405, 101,
304, 304, 1165, 1165, 193, 771, 771, 1060, 1060, 482, 530, 530, 2024,
2024, 254, 254, 241, 241, 241, 241, 674, 674, 674, 674, 137, 137, 623,
623, 623, 623, 603, 603, 603, 603, 285, 556, 556, 970, 970, 285, 728,
728, 499, 499, 272, 1349, 1349, 218, 218, 272, 1240, 1240, 95, 95, 307,
307, 307, 307, 307,

> summary(rural$iwt)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      1     400    1078    1894    2981   54320
>

I just want the random sample to look as close as possible to the population (that is, to match the weighted proportions generated from the survey sample). I thought sample() would automatically normalize the probability vector. I am not sure I am reading this right; I might be totally off track.

Regards,
Mehtab

-----Original Message-----
From: Thomas Lumley [mailto:[EMAIL PROTECTED]
Sent: Wednesday, October 31, 2007 9:44 AM
To: Azam, Mehtabul
Cc: [EMAIL PROTECTED]
Subject: Re: [R] survey weights in sample with replacement

On Tue, 30 Oct 2007, Azam, Mehtabul wrote:

> Hi,
> I am trying to draw a random sample from an household survey with
> sample weight. Is there any function in R or Splus which allows this.

It depends on exactly what you want.

The sample() function will draw unequal probability samples with replacement.

sample() will also draw samples without replacement, but (as documented) it uses sequential sampling and so does not actually generate probabilities proportional to the specified weights for sample sizes greater than 1. The error in sequential sampling is pretty small, but it has attracted a lot of creativity in the survey literature (probably more than it deserves).

The 'sampling' package implements several algorithms for drawing unequal probability samples without replacement that really are proportional to the specified weights where this is achievable.

    -thomas

Thomas Lumley                   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]               University of Washington, Seattle

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
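
A note on the errors in the session above: R argument names are case-sensitive, and sample() names its weight argument prob, so Prob is rejected as an unused argument. Separately, calling sample() directly on a data frame draws from its columns rather than its rows, so the usual approach is to sample row indices and then subset. The following is only a minimal sketch under assumptions: it uses a small made-up stand-in for rural (the real data are not available here), and it assumes iwt is a non-negative numeric column. sample() rescales prob internally, so the raw weights can be passed without normalizing.

```r
# Hypothetical toy stand-in for the 'rural' data frame (values are made up).
set.seed(1)
rural <- data.frame(
  id  = 1:1000,
  iwt = sample(c(95, 133, 623, 2024), 1000, replace = TRUE)
)

# Sample row indices rather than the data frame itself.
# Note the lowercase 'prob'; the weights need not sum to one.
idx <- sample(nrow(rural), size = 500, replace = TRUE, prob = rural$iwt)

# Subset the rows that were drawn (rows can repeat, since replace = TRUE).
z <- rural[idx, ]

# Rough check: the plain mean in the resample should be in the neighbourhood
# of the weighted mean in the full data.
mean(z$id)
weighted.mean(rural$id, rural$iwt)
```

With the real data the same pattern would presumably read sample(nrow(rural), 5000, replace = TRUE, prob = rural$iwt). For sampling without replacement, the caveat about sequential sampling in the quoted reply still applies.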