On May 6, 2011, at 3:15 PM, Christopher G Oakley wrote:

> Is there a way to generate a new dataframe that produces x lines based on the
> contents of a column?
>
> for example: I would like to generate a new dataframe with 70 lines of
> data[1, 1:3], 67 lines of data[2, 1:3], 75 lines of data[3, 1:3] and so on up
> to numrow = sum(count).
>
> > data
>
> pop fam yesorno count
>   1 126       1    70
>   1 127       1    67
>   1 128       1    75
>   1 126       0    20
>   1 127       0    23
>   1 128       0    15
>
>
> Thanks,
>
> Chris

# Better not to use 'data' as the name of an R object, to avoid
# confusion with functions where 'data' is the name of an
# argument, such as regression models. R is smart enough
# to generally know the difference, but a different name makes
# the code less confusing to read.

> DF
  pop fam yesorno count
1   1 126       1    70
2   1 127       1    67
3   1 128       1    75
4   1 126       0    20
5   1 127       0    23
6   1 128       0    15


Use rep() to generate a vector of repeated row indices (see ?rep):

> rep(1:nrow(DF), DF$count)
  [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 [34] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 [67] 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[100] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[133] 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
[166] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
[199] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
[232] 4 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6
[265] 6 6 6 6 6 6


> table(rep(1:nrow(DF), DF$count))

 1  2  3  4  5  6
70 67 75 20 23 15


Now use that vector as the row index, keeping only the first three columns:

DF.New <- DF[rep(1:nrow(DF), DF$count), 1:3]

> str(DF.New)
'data.frame':   270 obs. of  3 variables:
 $ pop    : int  1 1 1 1 1 1 1 1 1 1 ...
 $ fam    : int  126 126 126 126 126 126 126 126 126 126 ...
 $ yesorno: int  1 1 1 1 1 1 1 1 1 1 ...


> with(DF.New, table(fam, yesorno))
     yesorno
fam    0  1
  126 20 70
  127 23 67
  128 15 75


If you need something more general for generating 'raw' data of various
types from a contingency table, search the list archives for the function
"expand.dft", which I posted a few years ago and which I think found its way
into a couple of CRAN packages.
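
For reuse, the rep()-based indexing above could be wrapped in a small helper. The following is only a sketch: the function name expand_rows and its count_col argument are made up for illustration and are not part of base R or of the original reply. It assumes the counts are non-negative whole numbers with no missing values.

```r
# Hypothetical helper (not from the original post): repeat each row of a
# data frame as many times as its count column says, then drop that column.
expand_rows <- function(df, count_col = "count") {
  counts <- df[[count_col]]
  # Assumption: counts are non-negative numbers with no NAs.
  stopifnot(is.numeric(counts), !anyNA(counts), all(counts >= 0))

  # Same idea as rep(1:nrow(DF), DF$count); seq_len() also copes with
  # a zero-row data frame, where 1:nrow(df) would give c(1, 0).
  idx <- rep(seq_len(nrow(df)), times = counts)

  out <- df[idx, setdiff(names(df), count_col), drop = FALSE]
  rownames(out) <- NULL  # replace row names such as "1.1", "1.2" with 1..n
  out
}

# The data frame from the example above.
DF <- data.frame(
  pop     = c(1L, 1L, 1L, 1L, 1L, 1L),
  fam     = c(126L, 127L, 128L, 126L, 127L, 128L),
  yesorno = c(1L, 1L, 1L, 0L, 0L, 0L),
  count   = c(70L, 67L, 75L, 20L, 23L, 15L)
)

DF.New <- expand_rows(DF)
nrow(DF.New)                       # 270, i.e. sum(DF$count)
with(DF.New, table(fam, yesorno))  # recovers the original counts
```

Selecting the columns by name rather than with 1:3 means the helper should keep working if the count column is not the last one. Resetting the row names is optional and only makes the result easier to read.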

HTH,

Marc Schwartz

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.