Here is the solution using sqldf which can do it in one statement:

> # read in data
> Lines <- "OBS NAME SCORE
+ 1 Tom 92
+ 2 Tom 88
+ 3 Tom 56
+ 4 James 85
+ 5 James 75
+ 6 James 32
+ 7 Dawn 56
+ 8 Dawn 91
+ 9 Clara 95
+ 10 Clara 84"
>
> DF <- read.table(textConnection(Lines), header = TRUE)
>
> # run
> library(sqldf)
> sqldf("select NAME, avg(SCORE) from DF group by NAME having count(*) = 3")
   NAME avg(SCORE)
1 James   64.00000
2   Tom   78.66667

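For comparison, the same "group, then keep only groups of size 3" filter can be sketched in base R without sqldf. This is only an illustration and not part of the original reply: the variable names n, keep, and res are made up here, and read.table(text = ...) assumes a reasonably recent version of R.

```r
# Same data as in the post, read from an inline string.
Lines <- "OBS NAME SCORE
1 Tom 92
2 Tom 88
3 Tom 56
4 James 85
5 James 75
6 James 32
7 Dawn 56
8 Dawn 91
9 Clara 95
10 Clara 84"
DF <- read.table(text = Lines, header = TRUE)

# Count the observations for each NAME.
n <- table(DF$NAME)

# Names that have exactly 3 observations (the HAVING clause in the SQL).
keep <- names(n)[n == 3]

# Mean SCORE by NAME, computed over the retained rows only.
res <- aggregate(SCORE ~ NAME, data = DF[DF$NAME %in% keep, ], FUN = mean)
print(res)
```

This should return means for James and Tom only, matching the sqldf result. The sqldf version stays shorter because the group-size condition is expressed directly in the query.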
On Tue, Jan 5, 2010 at 2:03 PM, Gabor Grothendieck <ggrothendi...@gmail.com> wrote:
> Have a look at this post and the rest of that thread:
>
> https://stat.ethz.ch/pipermail/r-help/2010-January/223420.html
>
> On Tue, Jan 5, 2010 at 1:29 PM, Geoffrey Smith <g...@asu.edu> wrote:
>> Hello, does anyone know how to take the mean for a subset of observations?
>> For example, suppose my data looks like this:
>>
>> OBS NAME SCORE
>> 1 Tom 92
>> 2 Tom 88
>> 3 Tom 56
>> 4 James 85
>> 5 James 75
>> 6 James 32
>> 7 Dawn 56
>> 8 Dawn 91
>> 9 Clara 95
>> 10 Clara 84
>>
>> Is there a way to get the mean of the SCORE variable by NAME but only when
>> the number of observations is equal to 3? In other words, is there a way to
>> get the mean of the SCORE variable for Tom and James, but not for Dawn and
>> Clara? Thank you.
>>
>> --
>> Geoffrey Smith
>> Visiting Assistant Professor
>> Department of Finance
>> W. P. Carey School of Business
>> Arizona State University
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>