I am trying to calculate the treatment effect for individual subjects ("ID"),
i.e. the change in a "score" between 2 time points ("visit") (see example below).

The data is in an unbalanced data.frame in "long" format with some missing data.

I suspect that I am overlooking a very simple function, something along the lines of
tapply().

Thank you for your attention!


Derek Eder



##  Examples:

myData = data.frame(
  ID = c("a","a","b","c","c","d","d"),
  visit=c(1,2,1,1,2,1,2),
  score=c(10,2,12,16,0,NA,5)
  )

> myData
  ID visit score
1  a     1    10
2  a     2     2
3  b     1    12
4  c     1    16
5  c     2     0
6  d     1    NA
7  d     2     5

# The desired result is a vector of score differences
# (visit 2 minus visit 1, the sign convention of diff()) by ID
#   a   b   c   d
#  -8  NA -16  NA
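
A minimal sketch of how such a vector might be built with tapply(), assuming each ID has at most one row per visit. The name score_change is only illustrative. The sign follows diff(), i.e. visit 2 minus visit 1; swap the operands for the opposite sign.

```r
myData <- data.frame(
  ID    = c("a", "a", "b", "c", "c", "d", "d"),
  visit = c(1, 2, 1, 1, 2, 1, 2),
  score = c(10, 2, 12, 16, 0, NA, 5)
)

# Group the row indices by ID, then look up each visit explicitly.
# match() returns NA when a visit is absent, and indexing with NA
# yields NA, so incomplete subjects come out as NA.
score_change <- tapply(seq_len(nrow(myData)), myData$ID, function(i) {
  d <- myData[i, ]
  d$score[match(2, d$visit)] - d$score[match(1, d$visit)]
})

score_change
```

Looking the visits up by value, rather than relying on row order, should keep the result correct even if the rows are not sorted by visit.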



##  Solutions?

# This runs, but the returned data frame is awkward for me
# because the "empty cells" (b and d) contain a zero-length
# vector, numeric(0), and not the more familiar NA.

> aggregate(data=myData, score~ID,FUN=diff)
  ID score
1  a    -8
2  b
3  c   -16
4  d
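
One possible variant of the aggregate() call, sketched under two assumptions: rows are already sorted by visit within each ID, and no ID has more than two rows. The formula method of aggregate() drops rows with a missing score by default (na.action = na.omit), which is why d ends up with a single value; passing na.action = na.pass keeps them. safe_diff is a hypothetical helper, not part of R.

```r
myData <- data.frame(
  ID    = c("a", "a", "b", "c", "c", "d", "d"),
  visit = c(1, 2, 1, 1, 2, 1, 2),
  score = c(10, 2, 12, 16, 0, NA, 5)
)

# Hypothetical helper: diff() for exactly two values, NA otherwise,
# so every group returns a length-one result.
safe_diff <- function(x) if (length(x) == 2) diff(x) else NA_real_

res <- aggregate(score ~ ID, data = myData, FUN = safe_diff,
                 na.action = na.pass)  # keep rows where score is NA

res
```

Because every group now returns a single number, the score column should be an ordinary numeric vector rather than a list of uneven lengths.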


# This works as desired ... but somehow seems unnecessarily complicated

> reshape(data=myData,timevar="visit",idvar="ID", direction="wide")
  ID score.1 score.2
1  a      10       2
3  b      12      NA
4  c      16       0
6  d      NA       5

> apply(X = reshape(data=myData,timevar="visit",idvar="ID", direction="wide")[,-1],
      MARGIN = 1, FUN = diff)

  1   3   4   6
 -8  NA -16  NA
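
A slightly shorter take on the reshape() approach, as a sketch: once the data are wide, the two score columns can be subtracted directly instead of calling apply() row by row, and the result can be named by ID rather than by row number. The column names score.1 and score.2 are the ones reshape() produced above; wide and res are illustrative names.

```r
myData <- data.frame(
  ID    = c("a", "a", "b", "c", "c", "d", "d"),
  visit = c(1, 2, 1, 1, 2, 1, 2),
  score = c(10, 2, 12, 16, 0, NA, 5)
)

# One row per ID, one score column per visit.
wide <- reshape(myData, timevar = "visit", idvar = "ID", direction = "wide")

# Vectorised subtraction: NA in either column propagates to the result.
res <- setNames(wide$score.2 - wide$score.1, as.character(wide$ID))

res
```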

