I need some help with a formula processing problem that arose from a seemingly innocuous request that I add a “subset” argument to the additive modeling function “rqss” in my quantreg package.
I’ve tried to boil the relevant code down to something simpler as illustrated below. The formulae in question involve terms called “qss” that construct sparse matrix objects, but I’ve replaced all that with a much simpler BoxCox construction that I hope illustrates the basic difficulty. What is supposed to happen is that xss objects are evaluated and cbind’d to the design matrix, subject to the same subset restriction as the rest of the model frame. However, this doesn’t happen, instead the xss vectors are evaluated on the full sample and the cbind operation generates a warning which probably should be an error. I’ve inserted a browser() to make it easy to verify that the length of xss[[[1]] doesn’t match dim(X). Any suggestions would be most welcome, including other simplifications of the code. Note that the function untangle.specials() is adapted, or perhaps I should say adopted form the survival package so you would need the quantreg package to run the attached code. Thanks, Roger fit <- function(formula, subset, data, ...){ call <- match.call() m <- match.call(expand.dots = FALSE) tmp <- c("", "formula", "subset", "data") m <- m[match(tmp, names(m), nomatch = 0)] m[[1]] <- as.name("model.frame") Terms <- if(missing(data)) terms(formula,special = "qss") else terms(formula, special = "qss", data = data) qssterms <- attr(Terms, "specials")$qss if (length(qssterms)) { tmpc <- untangle.specials(Terms, "qss") dropx <- tmpc$terms if (length(dropx)) Terms <- Terms[-dropx] attr(Terms, "specials") <- tmpc$vars fnames <- function(x) { fy <- all.names(x[[2]]) if (fy[1] == "cbind") fy <- fy[-1] fy } fqssnames <- unlist(lapply(parse(text = tmpc$vars), fnames)) qssnames <- unlist(lapply(parse(text = tmpc$vars), function(x) deparse(x[[2]]))) } if (exists("fqssnames")) { ffqss <- paste(fqssnames, collapse = "+") ff <- as.formula(paste(deparse(formula), "+", ffqss)) } m$formula <- Terms m <- eval(m, parent.frame()) Y <- model.extract(m, "response") X <- model.matrix(Terms, m) ef <- environment(formula) qss <- function(x, lambda) (x^lambda - 1)/lambda if (length(qssterms) > 0) { xss <- lapply(tmpc$vars, function(u) eval(parse(text = u), m, enclos = ef)) for(i in 1:length(xss)){ X <- cbind(X, xss[[i]]) # Here is the problem } } browser() z <- lm.fit(X,Y) # The dreaded least squares fit z } # Test case n <- 200 x <- sort(rchisq(n,4)) z <- rnorm(n) s <- sample(1:n, n/2) y <- log(x) + rnorm(n)/5 D = data.frame(y = y, x = x, z = z, s = (1:n) %in% s) plot(x, y) lam = 0.2 #f0 <- fit(y ~ qss(x,lambda = lam) + z, subset = s) f1 <- fit(y ~ qss(x, lambda = lam) + z, subset = s, data = D) ______________________________________________ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.