Jim Lemon <jim <at> bitwrit.com.au> writes:

>
> On 08/24/2013 04:16 AM, Michael Friendly wrote:
> > For sequential analysis of sequences of events, I want to calculate a
> > series of lagged versions of a (numeric or character) variable. The
> > simple function below does this, but I can't see how to generalize
> > this to the case where there is also a factor variable and I want to
> > calculate lags separately for each level of the factor (by). Can
> > anyone help?
> > ...
[snip]
> >
> Hi Michael,
> Maybe this will do it.
>
> lags <- function(x, k=1, prefix='lag', by) {
>   if(missing(by)) {
>     n <- length(x)
>     res <- data.frame(lag0=x)
>     for (i in 1:k) {
>       res <- cbind(res, c(rep(NA, i), x[1:(n-i)]))
>     }
>     colnames(res) <- paste0(prefix, 0:k)
>     return(res)
>   }
>   else {
>     for(levl in levels(by)) {
>       nextlags<-lags(x[by==levl,],prefix=prefix)
>       rownames(nextlags)<-paste(levl,rownames(nextlags),sep=".")
>       if(exist(res)) res<-rbind(res,nextlags)
>       else res<-nextlags
>     }
>   }
> }
>
> Jim

Untested? I get

> lags(mtcars$mpg,2)
   lag0 lag1 lag2
1  21.0   NA   NA
2  21.0 21.0   NA
3  22.8 21.0 21.0
4  21.4 22.8 21.0
5  18.7 21.4 22.8
6  18.1 18.7 21.4
7  14.3 18.1 18.7
[ ... ]

which looks ok, and

> lags(mtcars$mpg,2,by=factor(mtcars$cyl))
Error in x[by == levl, ] : incorrect number of dimensions

Michael, try this:

lagframe <- function(x,k=1,prefix='lag',by){
  lag.one <- function(x) c(NA,head(x,-1))
  indx <- if (missing(by)) lag.one(seq_along(x)) else {
    spl.by <- split(seq_along(by),by)
    lag.spl.by <- lapply(spl.by, lag.one )
    unsplit(lag.spl.by,by)
  }
  res <- setNames(data.frame(x), paste0(prefix,"0") )
  for (i in 1:k)
    res[[ paste0(prefix,i) ]] <- res[[ paste0(prefix,i-1) ]][ indx ]
  res
}

> lags(mtcars$mpg,2)
   lag0 lag1 lag2
1  21.0   NA   NA
2  21.0 21.0   NA
3  22.8 21.0 21.0
4  21.4 22.8 21.0
5  18.7 21.4 22.8
[...]

> cbind( lagframe(mtcars$mpg,2,by=mtcars$cyl), cyl=mtcars$cyl)
   lag0 lag1 lag2 cyl
1  21.0   NA   NA   6
2  21.0 21.0   NA   6
3  22.8   NA   NA   4
4  21.4 21.0 21.0   6
5  18.7   NA   NA   8
6  18.1 21.4 21.0   6
7  14.3 18.7   NA   8
8  24.4 22.8   NA   4
9  22.8 24.4 22.8   4
10 19.2 18.1 21.4   6
11 17.8 19.2 18.1   6
12 16.4 14.3 18.7   8
[...]
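
A note on how lagframe gets the grouped lags: it never lags the data directly. It builds one index vector that gives, for each row, the position of the previous row in the same group (NA for the first row of each group), and then indexes with that vector once per lag. The sketch below shows the same idea in isolation. The toy vectors x and g and the name prev are made up for illustration and are not from the thread.

```r
# Toy data (hypothetical): six values in two interleaved groups.
x <- c(10, 20, 30, 40, 50, 60)
g <- factor(c("a", "b", "a", "b", "a", "b"))

# Shift a vector down by one position, padding the front with NA.
lag.one <- function(v) c(NA, head(v, -1))

# Split the row numbers by group, lag them within each group, and put
# them back in the original row order. prev[i] is the row that precedes
# row i within its own group, or NA if row i starts its group.
prev <- unsplit(lapply(split(seq_along(g), g), lag.one), g)

# Indexing by prev once gives lag 1; indexing that result again gives lag 2.
# Indexing with NA yields NA, so group starts stay NA at every higher lag.
lag1 <- x[prev]
lag2 <- lag1[prev]

print(data.frame(g, lag0 = x, lag1, lag2))
```

Because the index is computed only once, asking for more lags costs one more subscript operation each, whatever the number of groups.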