Certainly this has been recognized as a potential problem:

http://developer.r-project.org/nonstandard-eval.pdf

However, it is convenient when you are performing an analysis and
entering commands directly, as opposed to writing a program, although
the potential ambiguities may well overshadow the convenience.

On Mon, Nov 10, 2008 at 2:04 PM, Wacek Kusnierczyk <[EMAIL PROTECTED]> wrote:
> pardon me, but does this address in any way the legitimate complaint of
> the rightfully confused user?
>
> consider the following:
>
> d = data.frame(a=1, b=2)
> a = c("a", "b")
> z = a
> # that is, both a and z are c("a", "b")
>
> subset(d, select=z)
> # gives two columns, since z is a two-element vector whose elements are
> # valid column names
>
> subset(d, select=a)
> # gives one column, since 'a' (but not a) is a valid column name
>
> subset(d, select=c(a,b))
> # gives two columns
>
>
> this is certainly what the authors intended, and they may have good
> grounds for this smart design. but this must break the expectation of a
> naive (r-naive, for that matter) user, who may otherwise have excellent
> experience in using a functional programming language, e.g., scheme.
> (especially scheme, where symbols and expressions are first-class
> objects, yet the distinction between a symbol or an expression and their
> referent is made painfully clear, perhaps except for when one hacks with
> macros.)
>
> the examples above illustrate the notorious problem with r that one can
> never tell whether 'a' means "the value referred to with the identifier
> 'a'" or "the symbol 'a'", unless one gets ugly surprises and is forced
> to study the documentation. and even then one may not get a clear answer.
>
> the example given by the confused user is a red flag warning.
> it's a typical abstraction where a nested sequence of operations (here
> print over names over subset) is abstracted into a single procedure,
> which can be called with whatever arguments are valid:
>
> pns = function(d, g) print(names(subset(d, select=g)))
>
> what sane person, without carefully studying the gory details of subset,
> will ever expect that if the first argument happens to have a column
> named 'g', only this one will be selected, while if it doesn't, subset
> will select the columns named by the components of what 'g' evaluates
> to? i wonder how many users have *not* noticed that what they get is
> not what they assume they get because of such tricky tricks, and in
> consequence were not able to publish their analyses (or worse, have
> published them).
>
> what is scary is that this may happen with about any other function in
> r, because the design is pervasive. no one should ever use any r
> function without first carefully reading the docs (which is not
> guaranteed to help) or trying it first on a number of carefully crafted
> test cases. if such care is not taken, results obtained with r cannot
> be taken seriously.
>
>
> vQ
>
>
> Gabor Grothendieck wrote:
>> Forgot the name part. Try:
>>
>> TestFunc2 <- function(DF, group) names(DF[group])
>> TestFunc3 <- function(...) names(subset(..., subset = TRUE))
>> TestFunc4 <- function(...) eval.parent(names(subset(..., subset = TRUE)))
>>
>> # e.g.
>> df1 <- data.frame(group = "G1", visit = "V1", value = 0.9)
>> TestFunc2(df1, c("group", "visit"))
>> TestFunc3(df1, c("group", "visit"))
>> TestFunc4(df1, c("group", "visit"))
>> TestFunc4(df1, c(group, visit)) # this works too
>>
>> On Mon, Nov 10, 2008 at 10:43 AM, Gabor Grothendieck
>> <[EMAIL PROTECTED]> wrote:
>>
>>> Here are a few things to try:
>>>
>>> TestFunc1 <- get("[")
>>>
>>> TestFunc2 <- function(DF, group) DF[group]
>>>
>>> TestFunc3 <- function(...)
>>>   subset(..., subset = TRUE)
>>>
>>>
>>>
>>> On Mon, Nov 10, 2008 at 10:18 AM, Karl Knoblick <[EMAIL PROTECTED]> wrote:
>>>
>>>> Hello!
>>>>
>>>> My problem is that inside my function the value of the passed
>>>> argument is not used; instead, the data frame's own column of the
>>>> same name is selected. It is difficult to explain, but here is an
>>>> easy example:
>>>>
>>>> TestFunc<-function(df, group) {
>>>>    print(names(subset(df, select=group)))
>>>> }
>>>> df1<-data.frame(group="G1", visit="V1", value=0.9)
>>>> TestFunc(df1, c("group", "visit"))
>>>>
>>>> Result:
>>>> [1] "group"
>>>>
>>>> But I expected and want to have [1] "group" "visit" as the result!
>>>> Does anybody know how to get this result?
>>>>
>>>> Thanks!
>>>> Karl
>>>>
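
For anyone reading this thread later, the behaviour discussed above can be reproduced with a small sketch. `select_cols` is a hypothetical helper, not part of R. It is modeled on what `subset` appears to do with `select`: capture the unevaluated expression, then evaluate it against a list of column indices, falling back to the caller's frame. Treat it as an illustration under that assumption, not as the actual implementation.

```r
# Hypothetical helper, modeled on how subset() seems to handle `select`.
select_cols <- function(x, select) {
  nl <- as.list(seq_along(x))   # one index per column ...
  names(nl) <- names(x)         # ... named after that column
  # Evaluate the unevaluated `select` expression: column names are looked
  # up first, and only then variables in the caller's frame.
  idx <- eval(substitute(select), nl, parent.frame())
  x[idx]
}

d <- data.frame(a = 1, b = 2)
a <- c("a", "b")
z <- a

names(select_cols(d, z))        # z is not a column, so its value is used
names(select_cols(d, a))        # a is a column, so the column wins
names(select_cols(d, c(a, b)))  # both symbols resolve to columns

# Inside a wrapper, plain indexing evaluates the argument in the usual way,
# so a column that happens to share the argument's name cannot shadow it.
pns <- function(df, group) names(df[group])
df1 <- data.frame(group = "G1", visit = "V1", value = 0.9)
pns(df1, c("group", "visit"))
```

The last call is the same idea as TestFunc2 in the quoted reply: when the column names arrive as a character vector, `df[group]` sidesteps the symbol-versus-value ambiguity.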