Hi
I have a data.frame t which contains size measurements of individual plants,
identified by the field plate. It contains, among others, a field
year indicating the year in which the individual was measured, and the
height. The number of measurements ranges from 1 to 4 in
different
If you are worried about efficiency and the data frame is large or complex,
you could work with the row indices only. That would be more efficient,
because split() then partitions an integer vector instead of the whole data frame:
sapply(split(seq_len(nrow(t)), t$plate),
       function(x) t$id[x][which.max(t$year[x])])
15 20 33 43 44 47
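If the goal is the full rows rather than the ids, the same index trick can be turned into a row selection. This is only a sketch with made-up data, assuming columns plate, year and height; the real t has more fields. Note that which.max() picks the first row if a plate has several rows in its latest year.

```r
# Toy data (hypothetical values, not from the original data set)
t <- data.frame(
  plate  = c(15, 15, 20, 33, 33, 33),
  year   = c(2004, 2005, 2004, 2003, 2004, 2005),
  height = c(10, 12, 8, 5, 7, 9)
)

# For each plate, keep the row index of its latest year
rows <- sapply(split(seq_len(nrow(t)), t$plate),
               function(x) x[which.max(t$year[x])])

# Subset the original data frame by those row indices
t2 <- t[rows, ]
t2
```

Here rows is a vector of row numbers named by plate, so t2 keeps every column of t for the last measurement of each plant.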
What is wrong with the method that you have? It looks reasonably
efficient. As with other languages, there are always other ways of doing
it. Here is another one to consider, but it is basically the same:
sapply(split(t, t$plate), function(x) x$id[which.max(x$year)])
15 20 33
Finally I would like to have a data.frame t2 which only contains the
entries of the last measurements.
You could also use aggregate to get the max year per plate then join that back
to the original dataframe using merge on year and plate (common columns in both
dataframes).
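A minimal sketch of the aggregate-and-merge idea, assuming a toy data frame with columns plate, year and height. The values are made up for illustration, and the helper name last is hypothetical.

```r
# Toy data (hypothetical values, not from the original data set)
t <- data.frame(
  plate  = c(15, 15, 20, 33, 33, 33),
  year   = c(2004, 2005, 2004, 2003, 2004, 2005),
  height = c(10, 12, 8, 5, 7, 9)
)

# Latest year per plate; the result has the columns plate and year
last <- aggregate(year ~ plate, data = t, FUN = max)

# Join back on the common columns (plate, year) to recover the full rows
t2 <- merge(t, last)
t2
```

By default merge() joins on the columns the two data frames share, so no by argument is needed here. If a plate had two rows in its latest year, both would be kept.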
Hi
jim holtman wrote:
What is wrong with the method that you have? It looks reasonable
Actually there is nothing wrong with the approach I am using. It just
seemed quite complicated, and I assumed that there would be an easier
approach.
The dataset is not that large that I really have
Hi Chris
Chris Stubben wrote:
Finally I would like to have a data.frame t2 which only contains the
entries of the last measurements.
You could also use aggregate to get the max year per plate then join that back
to the original dataframe using merge on year and plate (common columns in