Try this one ff <- function (expr) { if (is.call(expr) && is.name(expr[[1]]) && is.element(as.character(expr[[1]]), c("~", "+", "-", "*", "/", ":", "%in%"))) { # the above list should cover the standard formula operators. for (i in seq_along(expr)[-1]) { expr[[i]] <- Recall(expr[[i]]) } } else if (is.name(expr)) { # the conversion itself expr <- as.name(paste0(toupper(as.character(expr)), "z")) } expr }
> ff(a) CATz + (age + Heading("Females") * (sex == "Female") * sbp) * Heading() * Gz + (age + sbp) * Heading() * TRIOz ~ Heading() * COUNTRYz * Heading() * SEXz Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com > -----Original Message----- > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On > Behalf > Of Frank Harrell > Sent: Thursday, August 15, 2013 4:45 PM > To: RHELP > Subject: Re: [R] regex challenge > > I really appreciate the excellent ideas from Bill Dunlap and Greg Snow. > Both suggestions almost work perfectly. Greg's recognizes expressions > such as sex=='female' but not ones such as age > 21, age < 21, a - b > > 0, and possibly other legal R expressions. Bill's idea is similar to > what Duncan Murdoch suggested to me. Bill's doesn't catch the case when > a variable appears both in an expression and as a regular variable (sex > in the example below): > > f <- function(formula) { > trms <- terms(formula) > variables <- as.list(attr(trms, "variables"))[-1] > ## the 'variables' attribute is stored as a call to list(), > ## so we changed the call to a list and removed the first element > ## to get the variables themselves. > if (attr(trms, "response") == 1) { > ## terms does not pull apart right hand side of formula, > ## so we assume each non-function is to be renamed. > responseVars <- lapply(all.vars(variables[[1]]), as.name) > variables <- variables[-1] > } else { > responseVars <- list() > } > ## omit non-name variables from list of ones to change. > ## This is where you could expand calls to certain functions. > variables <- variables[vapply(variables, is.name, TRUE)] > variables <- c(responseVars, variables) # all are names now > names(variables) <- vapply(variables, as.character, "") > newVars <- lapply(variables, function(v) as.name(paste0(toupper(v), > "z"))) > formula(do.call("substitute", list(formula, newVars)), > env=environment(formula)) > } > > a <- cat + (age + Heading("Females") * (sex == "Female") * sbp) * > Heading() * g + (age + sbp) * Heading() * trio ~ Heading() * > country * Heading() * sex > f(a) > > Output: > > CATz + (AGEz + Heading("Females") * (SEXz == "Female") * SBPz) * > Heading() * Gz + (AGEz + SBPz) * Heading() * TRIOz ~ Heading() * > COUNTRYz * Heading() * SEXz > > The method also doesn't work if I replace sex == 'Female' with x3 > 4, > converting to X3z > 4. I'm not clear on how to code what kind of > expressions to ignore. > > Thanks! > Frank > > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.