Hi: This is an abridged version of the reply I sent privately to the OP.
#### Generate an artificial data frame

# function to randomly generate one of the Q* columns with length 1000
mysamp <- function() sample(c(-1, 0, 1, NA), 1000,
                            prob = c(0.35, 0.2, 0.4, 0.05), replace = TRUE)

# use above function to randomly generate 10 questions and assign them
# names in the workspace
for(i in 1:10) assign(paste('Q', i, sep = ''), mysamp())

# create a data frame from the generated questions
C <- data.frame(time = rep(1:4, each = 250),
                sector = sample(LETTERS[1:6], 1000, replace = TRUE),
                Q1, Q2, Q3, Q4, Q5, Q6, Q7, Q8, Q9, Q10)
####

# A function to generate the scores from the combined questions
# for an arbitrary input data frame d:
scorefun <- function(d) {
    # Count -1, 0, 1 in each question column (NAs are dropped). Fixed factor
    # levels keep the rows in the order -1, 0, 1 and retain zero counts.
    qs <- paste('Q', 1:10, sep = '')
    dm <- sapply(d[, qs], function(x) table(factor(x, levels = c(-1, 0, 1))))
    # Sum the counts within the question groups Q1-Q3, Q4, Q5-Q6, Q7-Q8, Q9-Q10
    tsums <- cbind(rowSums(dm[, 1:3]), dm[, 4], rowSums(dm[, 5:6]),
                   rowSums(dm[, 7:8]), rowSums(dm[, 9:10]))
    # Balance: (positive - negative) / total non-missing answers
    dprop <- function(x) (x[3] - x[1])/sum(x)
    100 * (1 + apply(tsums, 2, dprop))
}

library(plyr)
# Apply scorefun() to each sub-data frame corresponding to time-sector combinations
ddply(C, .(time, sector), scorefun)

Dennis

On Sat, Jan 8, 2011 at 10:19 PM, Kari Manninen <k...@econadvisor.com> wrote:

> This is my first post to R-help and I look forward to receiving some advice
> for a novice like me...
>
> I've got a simple repeated (4 periods so far) 10-question survey data that
> is very easy to work on in Excel. However, I'd like to move the compilation
> to R but I'm having some trouble operating on count list data in a neat way.
>
> The data C
>
>> str(C)
> 'data.frame': 551 obs. of 13 variables:
>  $ TIME  : int 1 1 1 1 1 1 1 1 1 1 ...
>  $ Sector: Factor w/ 6 levels "D","F","G","H",..: 1 1 1 1 1 1 1 1 1 1 ...
>  $ COMP  : Factor w/ 196 levels " (_____ __ _____) ",..: 73 133 128 109
>            153 147 56 26 142 34 ...
>  $ Q1    : int 0 0 1 1 0 -1 -1 1 1 -1 ...
>  $ Q2    : int 0 0 0 -1 0 -1 0 0 1 -1 ...
>  $ Q3    : int 0 0 0 1 0 -1 -1 1 1 -1 ...
>  $ Q4    : int -1 0 0 0 0 -1 0 -1 0 -1 ...
>  $ Q5    : int 0 0 0 -1 0 -1 0 -1 0 0 ...
>  $ Q6    : int 0 0 0 1 0 -1 0 -1 0 0 ...
>  $ Q7    : int 0 1 1 0 0 0 1 0 1 1 ...
>  $ Q8    : int 0 0 0 0 0 -1 0 0 1 0 ...
>  $ Q9    : int 0 1 0 0 0 -1 0 -1 1 -1 ...
>  $ Q10   : int 0 0 0 0 -1 -1 0 -1 0 0 ...
>
>> summary(C)
>       TIME        Sector      COMP           Q1               Q2
>  Min.   :1.000   D:130   A      :  4   Min.   :-1.000   Min.   :-1.0000
>  1st Qu.:2.000   F:126   B      :  4   1st Qu.: 0.000   1st Qu.: 0.0000
>  Median :3.000   G:158   C      :  4   Median : 1.000   Median : 0.0000
>  Mean   :2.684   H: 26   D      :  4   Mean   : 0.446   Mean   : 0.2178
>  3rd Qu.:4.000   I: 20   E      :  4   3rd Qu.: 1.000   3rd Qu.: 1.0000
>  Max.   :4.000   J: 91   F      :  4   Max.   : 1.000   Max.   : 1.0000
>                          (Other):527   NA's   :60.000   NA's   :69.0000
>
> The aim is to produce balance scores between the shares of positive and
> negative answers in the data. First, counts of -1, 0 and 1 (negative,
> neutral, positive) and missing NA (it would be so much simpler without the
> missing values) for each question Q1-Q10 for each period (TIME) in 6 Sectors:
>
> b<-apply(C[,4:13], 2, function (x) tapply(x,C[,1:2], count))
>
> I know that b is a list of data.frames dim(4x6) for each question, where
> each cell is a count list.
>
> For example, for Question 1, Time period 2, Sector 1:
>
>> str(b$Q1[2,1])
> List of 1
>  $ :data.frame: 4 obs. of 2 variables:
>   ..$ x    : int [1:4] -1 0 1 NA
>   ..$ freq : int [1:4] 3 9 12 2
>
> Now I would like to group questions (C[, 4:6], C[, 7], C[8:9], C[10:11]
> and C[, 12:13]) and sum counts (-1, 0, 1) for these groups and present
> them in percentage terms. I don't know how to do this efficiently for the
> whole data. I would not like to go through each cell separately.
>
> Then I'd give each group a balance score based on something like:
>
> Score = 100 + 100*[ pos% - neg%] for each group by TIME, Sector, while
> excluding the missing observations.
>
> ### This is not working
> Score <- 100 + 100*[sum(count( =="1")/sum(count(list( "-1", "0","1") -
> sum(count( =="-1")/sum(count(list( "-1", "0","1")] for each 5 groups
> defined above and by TIME, Sector
>
> I would greatly appreciate your help on this.
>
> Regards,
> - Kari Manninen
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
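
As a check on the scoring rule in the quoted message (Score = 100 + 100*[pos% - neg%], missing answers excluded), here is a minimal sketch for a single cell. The helper name balance_score() is hypothetical and not from the thread; the example vector is rebuilt from the counts the OP quotes for Question 1, Time period 2, Sector 1 (3 negative, 9 neutral, 12 positive, 2 missing). It uses base R only.

```r
# Hypothetical helper (an illustration, not part of the original reply):
# balance score for one vector of answers coded -1, 0, 1, with NA = missing.
balance_score <- function(x) {
    # factor() with fixed levels keeps zero counts; table() drops the NAs
    counts <- table(factor(x, levels = c(-1, 0, 1)))
    neg <- counts[["-1"]]
    pos <- counts[["1"]]
    # shares are taken over the non-missing answers only
    100 + 100 * (pos - neg) / sum(counts)
}

# Answers rebuilt from the quoted counts for Q1, period 2, sector 1
x <- rep(c(-1, 0, 1, NA), times = c(3, 9, 12, 2))
balance_score(x)   # 100 + 100 * (12 - 3) / 24 = 137.5
```

For a group of questions, the same function could be applied to the pooled answers of the group, for example balance_score(unlist(d[, c('Q1', 'Q2', 'Q3')])) for a sub-data frame d, which is equivalent to summing the per-question counts first.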