This problem in sampling::strata() comes from calling cbind on a zero-row data.frame with a scalar number.
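
As a minimal base-R sketch of that failure mode (no sampling package needed), the lines below build a zero-row data.frame and try to cbind a scalar onto it. The names r and i are stand-ins borrowed from the traceback; this is an assumption about the shapes involved, not the actual code inside strata(). Recycling the scalar to nrow(r) makes the shapes agree:

```r
# Stand-ins for the objects named in the traceback; NOT the real strata() internals.
r <- mtcars[0, c("mpg", "hp", "gear")]  # zero-row data.frame, like an empty stratum sample
i <- 3                                  # scalar, like a stratum index

# cbind() on a data.frame goes through data.frame(), which refuses to
# combine 0 rows with a length-1 vector, so this call errors.
failed <- inherits(try(cbind(r, i), silent = TRUE), "try-error")

# rep(i, length.out = 0) is a zero-length vector, so the row counts match
# and the result is a zero-row data.frame with one extra column.
fixed <- cbind(r, rep(i, length.out = nrow(r)))

failed       # TRUE
dim(fixed)   # 0 4
```

The same rep(..., length.out = nrow(r)) form also behaves as before when r has rows, since the scalar is simply repeated once per row.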
  > library(sampling)
  > strata(mtcars[,c("mpg","hp","gear")], strat="gear", size=c(5,5,0))
  Error in data.frame(..., check.names = FALSE) :
    arguments imply differing number of rows: 0, 1
  In addition: Warning message:
  In strata(mtcars[, c("mpg", "hp", "gear")], strat = "gear", size = c(5,  :
    the method is not specified; by default, the method is srswor
  > traceback()
  5: stop("arguments imply differing number of rows: ", paste(unique(nrows), collapse = ", "))
  4: data.frame(..., check.names = FALSE)
  3: cbind(deparse.level, ...)
  2: cbind(r, i)
  1: strata(mtcars[, c("mpg", "hp", "gear")], strat = "gear", size = c(5, 5, 0))

Changing that cbind call from
   cbind(r, i)
to
   cbind(r, rep(i, length.out=nrow(r)))
would fix it up.

cbind is not entirely consistent with what it does with a 0-row rectangular
input and a scalar.  With a matrix you get a 0-row result and a warning
  > m <- matrix(numeric(), nrow=0, ncol=3, dimnames=list(NULL,paste0("Col",1:3)))
  > str(cbind(m, 666))
   num[0 , 1:4]
   - attr(*, "dimnames")=List of 2
    ..$ : NULL
    ..$ : chr [1:4] "Col1" "Col2" "Col3" ""
  Warning message:
  In cbind(m, 666) :
    number of rows of result is not a multiple of vector length (arg 2)
With a data.frame you get an error
  > str(cbind(data.frame(m), 666))
  Error in data.frame(..., check.names = FALSE) :
    arguments imply differing number of rows: 0, 1

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Thomas Lumley
> Sent: Sunday, April 28, 2013 1:31 PM
> To: Jeff Newmiller
> Cc: R help (r-help@r-project.org)
> Subject: Re: [R] Stratified Random Sampling Proportional to Size
>
> It looks as though you can't sample zero observations from a stratum. If
> you take the example on the help page and change one of the sample sizes to
> zero you get exactly the same error.
>
> From the fact that there isn't a more explicit error message, I would guess
> that the author just never considered the possibility that someone would
> have a population stratum and not sample from it.
>
>    -thomas
>
>
> On Sun, Apr 28, 2013 at 7:14 PM, Jeff Newmiller
> <jdnew...@dcn.davis.ca.us> wrote:
>
> > a) Please post plain text
> >
> > b) Please make reproducible examples (e.g. telling us how you accessed a
> > database that we have no access to is not helpful). See ?head, ?dput and [1]
> >
> > c) I don't know anything about the sampling package or the strata
> > function, but I would recommend eliminating the rows that have zeros from
> > the input data. E.g.:
> >
> > stratum_cp <- stratum_cp[ 0<stratum_cp$stratp, ]
> >
> > [1] http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
> >
> > On Fri, 26 Apr 2013, Lopez, Dan wrote:
> >
> >> Hello R Experts,
> >>
> >> I kindly request your assistance on figuring out how to get a stratified
> >> random sampling proportional to 100.
> >>
> >> Below is my r code showing what I did and the error I'm getting with
> >> sampling::strata
> >>
> >> # FIRST I summarized count of records by the two variables I want to use as strata
> >>
> >> library(RODBC)
> >> library(sqldf)
> >> library(sampling)
> >> #After establishing connection I query the data and sort it by strata
> >> #APPT_TYP_CD_LL and EMPL_TYPE and store it in a dataframe
> >> CURRPOP<-sqlQuery(ch,"SELECT APPT_TYP_CD_LL, EMPL_TYPE,ASOFDATE,EMPLID,
> >>   NAME,DEPTID,JOBCODE,JOBTITLE,SAL_ADMIN_PLAN,RET_TYP_CD_LL FROM
> >>   PS_EMPLOYEES_LL WHERE EMPL_STATUS NOT IN('R','T') ORDER BY APPT_TYP_CD_LL,
> >>   EMPL_TYPE")
> >> #ROWID is a dummy ID I added and repositioned after the strat columns for later use
> >> CURRPOP$ROWID<-seq(nrow(CURRPOP))
> >> CURRPOP<-CURRPOP[,c(1:2,11,3:10)]
> >>
> >> # My strata. Stratp is how many I want sampled from each stratum. NOTE
> >> # THERE ARE SOME 0's which just means I won't sample from that group.
> >> stratum_cp<-sqldf("SELECT APPT_TYP_CD_LL,EMPL_TYPE, count(*) HC FROM
> >>   CURRPOP GROUP BY APPT_TYP_CD_LL,EMPL_TYPE")
> >> stratum_cp$stratp<-round(stratum_cp$HC/nrow(CURRPOP)*100)
> >>
> >> > stratum_cp
> >>    APPT_TYP_CD_LL EMPL_TYPE   HC stratp
> >> 1              FA         S    1      0
> >> 2              FC         S    5      0
> >> 3              FP         S  173      3
> >> 4              FR         H  170      3
> >> 5              FX         H   49      1
> >> 6              FX         S   57      1
> >> 7              IN         H 1589     25
> >> 8              IN         S 3987     63
> >> 9              IP         H    7      0
> >> 10             IP         S   53      1
> >> 11             SA         H    8      0
> >> 12             SE         S   43      1
> >> 13             SF         H   14      0
> >> 14             SF         S    1      0
> >> 15             SG         S   10      0
> >> 16             ST         H  107      2
> >> 17             ST         S    6      0
> >>
> >> #THEN I attempted to use sampling::strata using the instructions in that
> >> #package and got an error
> >>
> >> #I use stratum_cp$stratp for my sizes.
> >>
> >> > s<-strata(CURRPOP,c("APPT_TYP_CD_LL","EMPL_TYPE"),size=stratum_cp$stratp,method="srswor")
> >> Error in data.frame(..., check.names = FALSE) :
> >>   arguments imply differing number of rows: 0, 1
> >>
> >> > traceback()
> >> 5: stop("arguments imply differing number of rows: ", paste(unique(nrows), collapse = ", "))
> >> 4: data.frame(..., check.names = FALSE)
> >> 3: cbind(deparse.level, ...)
> >> 2: cbind(r, i)
> >> 1: strata(CURRPOP, c("APPT_TYP_CD_LL", "EMPL_TYPE"), size = stratum_cp$stratp,
> >>        method = "srswor")
> >>
> >> #In lieu of a reproducible sample here is some info regarding most of my data
> >> dim(CURRPOP)
> >> [1] 6280 11
> >> #Cols w/ personal info have been removed in this output
> >> > str(CURRPOP[,c(1:3,7:11)])
> >> 'data.frame': 6280 obs. of 8 variables:
> >>  $ APPT_TYP_CD_LL: Factor w/ 12 levels "FA","FC","FP",..: 1 2 2 2 2 2 3 3 3 3 ...
> >>  $ EMPL_TYPE     : Factor w/ 2 levels "H","S": 2 2 2 2 2 2 2 2 2 2 ...
> >>  $ ROWID         : int 1 2 3 4 5 6 7 8 9 10 ...
> >>  $ DEPTID        : int 9825 9613 9613 9852 9772 9852 9853 9853 9853 9854 ...
> >>  $ JOBCODE       : Factor w/ 325 levels "055.2","055.3",..: 311 112 112 112 112 112 298 299 299 300 ...
> >>  $ JOBTITLE      : Factor w/ 325 levels "Accounting Assistant",..: 227 192 192 192 192 192 190 191 191 153 ...
> >>  $ SAL_ADMIN_PLAN: Factor w/ 40 levels "ADE","AME","ASE",..: 36 38 38 38 38 38 31 31 31 31 ...
> >>  $ RET_TYP_CD_LL : Factor w/ 2 levels "TCP1","TCP2": 2 2 2 2 2 2 2 2 2 2 ...
> >>
> >> Daniel Lopez
> >> Workforce Analyst
> >> HRIM - Workforce Analytics & Metrics
> >> Strategic Human Resources Management
> >> wf-analytics-metrics@lists.llnl.gov
> >> (925) 422-0814
> >>
> >> [[alternative HTML version deleted]]
> >>
> >> ______________________________________________
> >> R-help@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >>
> > ---------------------------------------------------------------------------
> > Jeff Newmiller                        The     .....       .....  Go Live...
> > DCN:<jdnew...@dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
> >                                       Live:   OO#.. Dead: OO#..  Playing
> > Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
> > /Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
> >
>
> --
> Thomas Lumley
> Professor of Biostatistics
> University of Auckland