On Thu, Nov 20, 2008 at 2:20 AM, Dieter Menne <[EMAIL PROTECTED]> wrote:
> pufftissue pufftissue <pufftissue <at> gmail.com> writes:
>
>>
>> What I am getting is indeed:
>>
>> 7200 23955 34563 8934
>> 16.39977 10.03896 11.234 14.02
>>
>> I'd like the final output to be:
>>
>> subject_id hr_Stand_Deviation
>> 7200 16.39977
>> 23955 10.03896
>> 34563 11.234
>> 8934 14.02
>>
>
> The hard way could go like that; I personally got used to it, but I admit
> it is one of the things that are unusually difficult in R.
>
> dat = data.frame(SUBJECT_ID=sample(letters[1:5],100,TRUE),HR=rnorm(100))
> sd.list = with(dat, tapply(HR, SUBJECT_ID, sd))
> data.frame(SUBJECT_ID=rownames(sd.list),sd=sd.list)
>
> I think Hadley Wickham tried to make life easier with the plyr package,
> so I thought something like the below would work out of the box.
> However, there must be something wrong with the syntax, the
> result is only "approximately" correct.
>
> Dieter
>
> library(plyr)
> daply(dat,.(SUBJECT_ID),sd)
> ddply(dat,.(SUBJECT_ID),sd)

Well, that calculates sd on the whole data frame (like sd(dat)).
You probably want:

  ddply(dat, .(SUBJECT_ID), numcolwise(sd))

which calculates sd for numeric columns only, or

  ddply(dat, .(SUBJECT_ID), function(df) sd(df$HR))

which calculates it for HR explicitly.

Hadley

--
http://had.co.nz/
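
For reference, here is a minimal, self-contained sketch of the per-subject approach above, arranged to give the two-column layout the original poster asked for. The fixed seed, the names res and chk, and the base R aggregate() cross-check are assumptions added for illustration; they are not part of the thread.

```r
library(plyr)

set.seed(1)  # assumption: fixed seed so the simulated data is reproducible
dat <- data.frame(SUBJECT_ID = sample(letters[1:5], 100, TRUE),
                  HR = rnorm(100))

# Apply sd to the HR column of each subject's subset and return it in a
# named column, so the result has one row per subject.
res <- ddply(dat, .(SUBJECT_ID), function(df) {
  data.frame(hr_Stand_Deviation = sd(df$HR))
})

# Cross-check with base R (no plyr needed): sd of HR within each SUBJECT_ID.
chk <- aggregate(HR ~ SUBJECT_ID, data = dat, FUN = sd)

print(res)
```

Returning a one-row data frame from the function, rather than a bare number, is what lets the output column be named explicitly instead of receiving a default name.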