In a research project we are using a web-based tool for collecting questionnaire data. The system generates files that are simple to read as a data frame in the "long" format, which is then simple to convert to the "wide" format.

Two things can go wrong: (a) there may be two (or more) references to the same cell, and (b) there may be missing values. For example, the data set below has two references to S2/T2 and none to the S2/T1 combination:

> d
     values person time
  1       1     S1   T1
  2       2     S1   T2
  3       3     S1   T3
  4       4     S1   T4
  5      22     S2   T2
  6       6     S2   T2
  7       7     S2   T3
  8       8     S2   T4
  9       9     S3   T1
  10     10     S3   T2
  11     11     S3   T3
  12     12     S3   T4
> reshape(d, idvar="person", v.names="values", timevar="time", direction="wide")
   person values.T1 values.T2 values.T3 values.T4
 1     S1         1         2         3         4
 5     S2        NA        22         7         8
 9     S3         9        10        11        12

The missing cell gets an NA, as expected. The surprise is the case where there are two references to the same cell: the *first* is used (22 rather than 6).
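
To make this reproducible, the data frame above can be rebuilt as follows. This is only a minimal sketch: the construction code is written by hand for this post and is not what the web tool produces, and the name w is just a placeholder for the result.

```r
# Rebuild the example data frame shown above (hand-typed, not tool output).
d <- data.frame(
  values = c(1, 2, 3, 4, 22, 6, 7, 8, 9, 10, 11, 12),
  person = rep(c("S1", "S2", "S3"), each = 4),
  time   = c("T1", "T2", "T3", "T4",
             "T2", "T2", "T3", "T4",   # S2: T2 appears twice, T1 is missing
             "T1", "T2", "T3", "T4"),
  stringsAsFactors = FALSE
)

# Convert from long to wide; w is a placeholder name for the result.
w <- reshape(d, idvar = "person", v.names = "values",
             timevar = "time", direction = "wide")

# For S2, the T1 cell is NA and the T2 cell holds the first value (22).
w[w$person == "S2", ]
```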

Is there some way of forcing reshape() to use the *last* value?
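
One possible workaround, sketched here under the assumption that the row order of the file reflects the order in which the values were entered, is to drop the earlier duplicates before reshaping rather than changing what reshape() does. The names keep, d_last and w_last are placeholders; duplicated() with fromLast = TRUE is base R.

```r
# Same example data as above, rebuilt by hand so this runs on its own.
d <- data.frame(
  values = c(1, 2, 3, 4, 22, 6, 7, 8, 9, 10, 11, 12),
  person = rep(c("S1", "S2", "S3"), each = 4),
  time   = c("T1", "T2", "T3", "T4",
             "T2", "T2", "T3", "T4",
             "T1", "T2", "T3", "T4"),
  stringsAsFactors = FALSE
)

# With fromLast = TRUE, every occurrence of a person/time pair except the
# last one is flagged as a duplicate, so negating keeps only the last row.
keep   <- !duplicated(d[, c("person", "time")], fromLast = TRUE)
d_last <- d[keep, ]

# Each cell is now referenced at most once, so reshape() has no choice to make.
w_last <- reshape(d_last, idvar = "person", v.names = "values",
                  timevar = "time", direction = "wide")

# S2/T2 now holds 6 (the last value); S2/T1 is still NA.
w_last[w_last$person == "S2", ]
```

This only works if "last" means "last row in the file"; if the tool writes rows in some other order, a timestamp column would be needed to decide which duplicate to keep.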

Tom

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
