"...from an SQL database."    How? Structure of the result?

You say "ordered by date" but then you don't reference any date variable? And your code creates an "order" column, but that would not appear necessary for the stated purpose and you don't output the last "order" within a "person_id".

See if the tapply strategy below gives you alternate approaches (which should work for the typical default numbering of data.frame rows):

DF <-structure(list(Month = structure(c(2L, 2L, 2L, 2L, 1L, 1L, 1L,
1L, 3L, 3L, 3L, 3L, 3L), .Label = c("Aug", "July", "Sept"), class = "factor"),
    Week = 27:39, Estpassage = c(665L, 2232L, 9241L, 28464L,
    41049L, 82216L, 230411L, 358541L, 747839L, 459682L, 609567L,
    979475L, 837189L), MedFL = c(34L, 35L, 35L, 35L, 35L, 35L,
    35L, 35L, 35L, 36L, 36L, 36L, 36L)), .Names = c("Month",
"Week", "Estpassage", "MedFL"), class = "data.frame", row.names = c(NA,
-13L))

> DF
   Month Week Estpassage MedFL
1   July   27        665    34
2   July   28       2232    35
3   July   29       9241    35
4   July   30      28464    35
5    Aug   31      41049    35
6    Aug   32      82216    35
7    Aug   33     230411    35
8    Aug   34     358541    35
9   Sept   35     747839    35
10  Sept   36     459682    36
11  Sept   37     609567    36
12  Sept   38     979475    36
13  Sept   39     837189    36

tapply(as.numeric(rownames(DF)), DF$Month, max)
# substitute history for DF,  and history$person_id  for DF$Month

 Aug July Sept
   8    4   13

You need the as.numeric around the rownames to prevent the alpha interpretation of "maximum" from being used. The by() and aggregate() functions are convenience functions using tapply, but I have more success with tapply itself.

I rather wonder what use this is to be put to, and whether there might be more efficient overall approach. It is not that typical to need the location of the last member of a group within a dataframe.

--
David


On Feb 27, 2009, at 2:02 PM, Andrew Ziem wrote:

I want to find the last record for each person_id in a data frame
(from a SQL database) ordered by date.  Is there a better way than
this for loop?

for (i in 2:length(history[,1])) {
   if (history[i, "person_id"] == history[i - 1, "person_id"])
     history[i, "order"] = history[i - 1, "order"] + 1 # same person
   else
     history[i, "order"] = 1 # new person
}

# ignore all records except the last for each con_id
history2 <- subset(history, order == 1)


Andrew

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to