Congratulations! Could you explain why you add an initial "TRUE" value in the cumulative sum?
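To make the question concrete, here is a minimal sketch of what I think is going on, using the four consecutive days of user 2001 from the data quoted below. The variable names (days, gaps, with_true, without_true) are my own and not part of the original reply, and my reading may well be incomplete.

```r
# Days on which user 2001 played (taken from the quoted data)
days <- as.Date(c("2008/11/01", "2008/11/02", "2008/11/03", "2008/11/04"),
                format = "%Y/%m/%d")

# diff() returns one value fewer than there are days:
# TRUE marks a day that does NOT directly follow the previous one
gaps <- diff(days) != 1              # FALSE FALSE FALSE

# With the leading TRUE, every day (including the first) gets a run id
with_true <- cumsum(c(TRUE, gaps))   # 1 1 1 1

# Without it, the first day has no run id and drops out of the count
without_true <- cumsum(gaps)         # 0 0 0

max(table(with_true))                # 4, the full streak
max(table(without_true))             # 3, one day short
```

If that is right, the leading TRUE seems to serve two purposes: it keeps the run ids the same length as the vector of days, and it opens the first run at the first day. It also appears to cover a user with a single day (such as 2002), where diff() returns nothing at all.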
jholtman wrote:
>
> Will this work:
>
>> x <- read.table(textConnection(" day user_id
> + 2008/11/01 2001
> + 2008/11/01 2002
> + 2008/11/01 2003
> + 2008/11/01 2004
> + 2008/11/01 2005
> + 2008/11/02 2001
> + 2008/11/02 2005
> + 2008/11/03 2001
> + 2008/11/03 2003
> + 2008/11/03 2004
> + 2008/11/03 2005
> + 2008/11/04 2001
> + 2008/11/04 2003
> + 2008/11/04 2004
> + 2008/11/04 2005"), header=TRUE)
>> closeAllConnections()
>> # convert to Date
>> x$day <- as.Date(x$day, format="%Y/%m/%d")
>> # split by user and then look for contiguous days
>> contig <- sapply(split(x$day, x$user_id), function(.days){
> +     .diff <- cumsum(c(TRUE, diff(.days) != 1))
> +     max(table(.diff))
> + })
>> contig
> 2001 2002 2003 2004 2005
>    4    1    2    2    4
>>
>
> On Thu, Oct 1, 2009 at 11:29 AM, gd047 <gd...@mineknowledge.com> wrote:
>>
>> ...if that is possible
>>
>> My task is to find the longest streak of continuous days a user
>> participated in a game.
>>
>> Instead of writing an SQL function, I chose to use R's rle function to
>> get the longest streaks and then update my db table with the results.
>>
>> The (attached) dataframe is something like this:
>>
>> day user_id
>> 2008/11/01 2001
>> 2008/11/01 2002
>> 2008/11/01 2003
>> 2008/11/01 2004
>> 2008/11/01 2005
>> 2008/11/02 2001
>> 2008/11/02 2005
>> 2008/11/03 2001
>> 2008/11/03 2003
>> 2008/11/03 2004
>> 2008/11/03 2005
>> 2008/11/04 2001
>> 2008/11/04 2003
>> 2008/11/04 2004
>> 2008/11/04 2005
>>
>> --- R code follows
>> ------------------------------------------------------
>>
>> # turn it to a contingency table
>> my_table <- table(user_id, day)
>>
>> # get the streaks
>> rle_table <- apply(my_table,1,rle)
>>
>> # verify the longest streak of "1"s for user 2001
>> # as.vector(tapply(rle_table$'2001'$lengths, rle_table$'2001'$values, max)["1"])
>>
>> # loop to get the results
>> # initiate results matrix
>> res<-matrix(nrow=dim(my_table)[1], ncol=2)
>>
>> for (i in 1:dim(my_table)[1]) {
>>   string <- paste("as.vector(tapply(rle_table$'", rownames(my_table)[i],
>>     "'$lengths, rle_table$'", rownames(my_table)[i], "'$values, max)['1'])",
>>     sep="")
>>   res[i,]<-c(as.integer(rownames(my_table)[i]) , eval(parse(text=string)))
>> }
>>
>> ----------------------------------------------------
>> --- end of R code
>>
>> Unfortunately this for loop takes too long and I'm wondering if there is a
>> way to produce the res matrix using a function from the "apply" family.
>>
>> Thank you in advance
>> --
>> View this message in context:
>> http://www.nabble.com/Help-me-replace-a-for-loop-with-an-%22apply%22-function-tp25696937p25696937.html
>> Sent from the R help mailing list archive at Nabble.com.
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> --
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>
> What is the problem that you are trying to solve?
--
View this message in context: http://www.nabble.com/Help-me-replace-a-for-loop-with-an-%22apply%22-function-tp25696937p25704683.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.