I'm trying to learn how to calibrate/postStratify/rake survey data in preparation for a large survey effort we're about to embark upon. As a working example, I have results from a small survey of ~650 respondents, with ~90 response fields each, and I want to learn how to apply these functions properly.

My data are from an on-board bus survey. The expansion weight in the dataset is derived from three elements:

- Response rates by bus stop for a sampled run
- Total runs / sampled runs
- Normalized to (separately derived) daily line boardings

In order to get to the point of raking the data, I need to learn more about the survey package and its nomenclature. For instance, given how I've described the survey/weighting, is my call to svydesign correct? I'm not sure I understand just what a "survey design" is. Where can I read up on this? What's a good reference for such things as "PSUs", "cluster sampling", and so on?

I've tried the following code, which fails:

> SurveyData <- read.spss("C:/Data/R/orange_delivery.sav", use.value.labels=TRUE, max.value.labels=Inf, to.data.frame=TRUE)
> #===============================================================================
> temp <- sub(' +$', '', SurveyData$direction_)
> SurveyData$direction_ <- temp
> #===============================================================================
> SurveyData$NumStn=abs(as.numeric(SurveyData$lineon)-as.numeric(SurveyData$lineoff))
> EBSurvey <- subset(SurveyData, direction_ == "EASTBOUND" )
> XTTable <- xtabs(~direction_ , EBSurvey)
> XTTable
direction_
EASTBOUND
      345
> WBSurvey <- subset(SurveyData, direction_ == "WESTBOUND" )
> XTTable <- xtabs(~direction_ , WBSurvey)
> XTTable
direction_
WESTBOUND
      307
> #
> EBDesign <- svydesign(id=~sampn, weights=~expwgt, data=EBSurvey)
> # svytable(~lineon+lineoff, EBDesign)
> OnLabels <- c( "Warner Center", "De Soto", "Pierce College", "Tampa", "Reseda", "Balboa", "Woodley", "Sepulveda", "Van Nuys", "Woodman", "Valley College", "Laurel Canyon", "North Hollywood")
> EBOnNewTots <- c( 1000, 600, 1200, 500, 1000, 500, 200, 250, 1000, 300, 100, 50, 73.65 )
> EBNumStn <- c(673.65, 800, 1000, 1000, 800, 700, 600, 500, 400, 200, 50, 50 )
> ByEBOn <- data.frame(OnLabels,EBOnNewTots)
> ByEBNum <- data.frame(c(1:12),EBNumStn)
> RakedEBSurvey <- rake(EBDesign, list(~ByEBOn, ~ByEBNum), list(EBOnNewTots, EBNumStn ) )
Error in model.frame.default(margin, data = design$variables) :
  invalid type (list) for variable 'ByEBOn'
>

Robert Farley
Metro
1 Gateway Plaza
Mail Stop 99-23-7
Los Angeles, CA 90012-2952
Voice: (213)922-2532
Fax: (213)922-2868
www.Metro.net
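
For comparison, the error message shows that rake() evaluates each formula in its second argument against design$variables, so a formula there has to name a column of the survey data rather than a separate data frame. The population margins, in turn, are documented as tables or data frames with a Freq column. A minimal sketch of that call shape follows. It is not a fix for the code above: the toy data, the column names stop and dist, the weights, and the target totals are all made up for illustration, and both sets of targets are assumed to sum to the same grand total.

```r
library(survey)

# Hypothetical toy data: 8 respondents with a boarding stop and a trip-length class.
toy <- data.frame(
  id   = 1:8,
  stop = factor(c("A", "A", "A", "B", "B", "B", "B", "A")),
  dist = factor(c("short", "long", "short", "long", "short", "long", "long", "short")),
  wgt  = rep(10, 8)   # made-up starting expansion weights
)

# One respondent per sampling unit, with the starting weights.
des <- svydesign(id = ~id, weights = ~wgt, data = toy)

# Population margins: one data frame per raking variable.
# The first column is named after the variable in the design; Freq holds the target total.
pop_stop <- data.frame(stop = c("A", "B"),        Freq = c(60, 40))
pop_dist <- data.frame(dist = c("long", "short"), Freq = c(45, 55))

# Sample margins are formulas naming columns of the design data.
raked <- rake(
  des,
  sample.margins     = list(~stop, ~dist),
  population.margins = list(pop_stop, pop_dist),
  control = list(maxit = 50, epsilon = 1e-6, verbose = FALSE)
)

# Weighted margins after raking should be close to the targets.
svytable(~stop, raked)
svytable(~dist, raked)
```

In terms of the code above, this would presumably mean formulas such as ~lineon and ~NumStn, with each population data frame carrying the matching column name and a Freq column. Whether those are the right raking variables for this survey is a separate question.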