It appears that I owe Martin Maechler an apology for not realising the importance of the context for what I quoted. I apologise.

    but *please* note again the code snippet we were talking about :

    > dat <- sapply( seq(T-width), function(i) {
    >     model <- lm(dlinrchf ~ dlusdchf + dljpychf + dldemchf, A,
    >                 i:(i+width-1))
    >     details <- summary.lm(model)
    >     tmp <- coefficients(model)
    >     c( USD = tmp[2], JPY = tmp[3], DEM = tmp[4],
    >        R2 = details$r.squared, RMSE = details$sigma )
    > } )
    > dat <- as.data.frame(t(dat))
    > attach(dat)

    which is really an example where sapply() rather obfuscates than
    clarifies.

It's not clear to me that the choice of sapply() -vs- 'for' really has
anything to do with it here.

Hmm, maybe it does. Looking at this code, I can see at a glance that
 - dat will be a matrix
 - it will have columns 1:(T-width)
 - it will have rows USD, JPY, DEM, R2, RMSE
 - each column reflects one linear model
and I don't have to decode a lot of indexed assignment statements to
figure this out.

The first way to improve clarity would be to use keyword parameters on
the call to lm, e.g.,

    lm(..., data = A, subset = i:(i+width-1))

The second way to improve clarity would be to use character indices on
tmp rather than integer indices:

    coef <- coefficients(model)
    c(USD = coef["dlusdchf"], JPY = coef["dljpychf"], DEM = coef["dldemchf"],
      R2 = details$r.squared, RMSE = details$sigma)

Hmm. My "first" and "second" ways are both the same: use names rather
than position.

There is one more clarity improvement to recommend, and it has nothing
to do with using or avoiding sapply(), at least not directly.

    # dfapply(X, FUN, ...) is like sapply() but
    # expects FUN to return c(x1=...,xn=...) vectors which it
    # turns into rows of the data frame that it returns.

    dfapply <- function (...) as.data.frame(t(sapply(...)))

    # Make "dat" a data frame with columns USD, JPY, DEM, R2, RMSE
    # and rows 1:(T-width), the ith row extracted from a linear
    # regression on cases i:(i+width-1).

    dat <- dfapply(seq(T-width), function (i) {
        model <- lm(dlinrchf ~ dlusdchf + dljpychf + dldemchf,
                    data = A, subset = i:(i+width-1))
        s <- summary.lm(model)
        v <- coefficients(model)
        c(USD = v["dlusdchf"], JPY = v["dljpychf"], DEM = v["dldemchf"],
          R2 = s$r.squared, RMSE = s$sigma)
    })

Now here's where using sapply() instead of 'for' does pay off, even in
this example. We ask the question "where is 'i' used?" Because we're
*not* using i in any visible index calculations, there is only one place
that 'i' is used, and that's in the subset= argument of the lm() call.

That prompts the question "is there any way to exploit the fact that the
rest of the linear model is the same?" Depending on the relative sizes
of A and T-width, there may well be, and Statistical Models in S
explains, if memory serves me, how to do this kind of thing. But without
the fact that i is only used in one place, it might not be as obvious
that it was worth thinking about.
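
For anyone who wants to try the rolling-window version without the original data, here is a minimal sketch. Everything about the data is an assumption made up for illustration: the data frame A is simulated, n stands in for T (so as not to shadow R's shorthand for TRUE), and width = 20 is arbitrary. One small departure from the code in the thread: indexing a named vector with single brackets inside c(USD = ...) keeps the inner name as well, giving names such as "USD.dlusdchf", so the sketch uses [[ ]] to get the plain names USD, JPY, DEM.

```r
# Simulated stand-in for the data frame A from the thread (hypothetical values).
set.seed(1)
n <- 60                      # plays the role of T in the thread
A <- data.frame(dlusdchf = rnorm(n),
                dljpychf = rnorm(n),
                dldemchf = rnorm(n))
A$dlinrchf <- 0.5 * A$dlusdchf + 0.2 * A$dljpychf + 0.1 * A$dldemchf +
  rnorm(n, sd = 0.05)

width <- 20                  # window length, chosen arbitrarily here

# Like sapply(), but each result vector becomes one row of a data frame.
dfapply <- function(...) as.data.frame(t(sapply(...)))

# Fit the same model on cases i:(i+width-1) and return the summary values.
fit_window <- function(i) {
  model <- lm(dlinrchf ~ dlusdchf + dljpychf + dldemchf,
              data = A, subset = i:(i + width - 1))
  s <- summary(model)
  v <- coef(model)
  # [[ ]] drops the coefficient's own name, so the result is named
  # USD, JPY, DEM rather than USD.dlusdchf and so on.
  c(USD = v[["dlusdchf"]], JPY = v[["dljpychf"]], DEM = v[["dldemchf"]],
    R2 = s$r.squared, RMSE = s$sigma)
}

# One row per window start, as in the thread: starts 1 .. n - width.
dat <- dfapply(seq_len(n - width), fit_window)
str(dat)
```

seq_len(n - width) is used in place of seq(n - width) only because seq() counts downwards when its argument is 0; for the sizes discussed in the thread the two give the same indices.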