Assuming that yr and conf are the two factors referred to in the description, create a function f which calculates the ith row of the output and use sapply like this:

attach(data)
f <- function(i) {
    # hsgpa values of the other rows sharing row i's conf and yr, NAs dropped
    hsgpa <- na.omit(hsgpa[-i][conf[-i] == conf[i] & yr[-i] == yr[i]])
    if (length(hsgpa)) c(mean = mean(hsgpa), var = var(hsgpa))
    else c(mean = NA, var = NA)
}
# one row of (mean, var) per observation
out <- t(sapply(1:nrow(data), f))

On 6/12/06, David Kling <[EMAIL PROTECTED]> wrote:
> Hello:
>
> I hope none of you will mind helping a newbie. I'm a student research
> assistant working with a large data set in which observations are
> categorized according to two factors. I'm trying to calculate the group
> mean and variance of a variable (called 'hsgpa' in the example data
> presented below) for each observation, excluding that observation. For
> example, if there are 20 observations with the same value of the two
> factors, for each of the 20 I'd like to generate the mean and variance
> of the 'hsgpa' values of the other 19 group members. This must be done
> for every observation in the data set.
>
> I've searched the R mail archives, read the manuals, and read
> documentation for tapply() and by() as well as summaryBy() in the 'doBy'
> package and with() from 'Hmisc.' It may be that since I'm new to
> writing functions and R is the first language I've ever worked with I'm
> less able to come up with a solution than some other new R users. None
> of the functions I have tried have been successful, and it doesn't seem
> worth it to reproduce and explain my best effort. I hope someone has
> some ideas! Looking at what an experienced user would try should help
> me with my present task as well as future problems.
>
> Below I've included some lines that will generate a sample data set
> similar to the one I'm working with:
>
> #
> #Example data:
> #
> case <- sample(seq(1,10000,1),5000,replace=FALSE)
> hsgpa <- rbeta(5000,7,1.5)*4.25
> yr <- sample(seq(1993,2005,1),5000,replace=TRUE)
> conf <- sample(letters[1:5],5000,replace=TRUE)
> data <- data.frame(case=case,hsgpa=hsgpa,yr=yr,conf=conf)
> data$conf <- as.character(data$conf)
> s1 <- sample(seq(1,5000,1),500,replace=FALSE)
> k <- data$hsgpa
> k[row.names(data) %in% s1] <- NA
> data$hsgpa <- k
> s2 <- sample(seq(1,5000,1),100,replace=FALSE)
> k <- data$yr
> k[row.names(data) %in% s2] <- NA
> data$yr <- k
> k <- data$conf
> k[row.names(data) %in% s2] <- NA
> data$conf <- k
> remove(case,hsgpa,yr,conf,s1,s2,k)
> #
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
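
If the row-by-row sapply approach proves slow on a large data set, the same leave-one-out statistics can probably be obtained from per-group totals instead. The following is only a sketch under stated assumptions: loo_stats and its argument names are hypothetical (they are not from the original post or any package), and it relies on the sum-of-squares shortcut for the variance, which can lose precision when values are large relative to their spread. It is meant to follow the same conventions as f: rows with a missing factor get NA, a group with no other values gives NA for both columns, and a single other value gives NA for the variance.

```r
# Hypothetical helper (an illustration, not part of the original answer):
# leave-one-out group mean and variance from per-group totals.
loo_stats <- function(x, g1, g2) {
  # Build one grouping key per row; rows missing either factor get NA.
  key <- paste(g1, g2, sep = "|")
  key[is.na(g1) | is.na(g2)] <- NA
  f   <- factor(key)
  idx <- as.integer(f)              # NA for rows without a usable group

  ok <- !is.na(x)
  x0 <- ifelse(ok, x, 0)            # missing values add nothing to the totals

  # Per-group total of v, expanded back to one value per row.
  tot <- function(v) unname(vapply(split(v, f), sum, numeric(1)))[idx]
  n <- tot(as.numeric(ok))          # non-missing values in the row's group
  s <- tot(x0)                      # their sum
  q <- tot(x0^2)                    # their sum of squares

  # Remove the row's own contribution (nothing to remove if x is NA).
  m  <- n - ok
  s1 <- s - x0
  q1 <- q - x0^2

  cbind(mean = ifelse(m >= 1, s1 / m, NA),
        var  = ifelse(m >= 2, (q1 - s1^2 / m) / (m - 1), NA))
}
```

With the example data from the question, this would be called as out2 <- with(data, loo_stats(hsgpa, conf, yr)), and all.equal(unname(out), unname(out2)) is a reasonable check that both approaches agree.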