I basically agree with Rui: using substitute() will cause trouble. For example, how would the user iterate over the columns, calling your function for each one?

   for (column in names(dataFrame)) func(dataFrame, column)

would fail because dataFrame$column does not exist. You need to provide an extra argument to handle this case, something like the following:

   func <- function(df, columnAsName,
                    columnAsString = deparse(substitute(columnAsName))[1]) {
      ...
   }

The default value of columnAsString should also deal with the case where the user supplies something like log(Conc.) instead of Conc.
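A minimal sketch of the extra-argument idea described above. The names func, columnAsName and columnAsString come from this message; the body (returning the column mean), the error check and the sample data frame are assumptions made only for illustration. This sketch simply rejects an expression such as log(age) rather than handling it.

```r
# Hypothetical sketch: the argument names follow the message above; the body
# (returning the column mean) is an assumption for illustration only.
func <- function(df, columnAsName,
                 columnAsString = deparse(substitute(columnAsName))[1]) {
  # columnAsName itself is never evaluated; only its text form is used.
  if (!columnAsString %in% names(df)) {
    stop("no column named '", columnAsString, "' in the data frame")
  }
  mean(df[[columnAsString]])
}

# Assumed sample data, loosely modelled on the data frame in the thread.
mydf <- data.frame(id = 1:5, age = c(20, 34, 43, 32, 21))

# Interactive style: bare column name, no quotation marks.
func(mydf, age)

# Programmatic style: the column name is held in a variable.
for (column in names(mydf)) {
  cat(column, ":", func(mydf, columnAsString = column), "\n")
}
```

The bare-name call serves the interactive user, while the loop goes through columnAsString, so substitute() never sees the loop variable.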
I think that using a formula for the lazily evaluated argument (columnAsName) works well. The user then knows exactly how it gets evaluated.

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Tue, Dec 6, 2016 at 6:28 AM, John Sorkin <jsor...@grecc.umaryland.edu> wrote:

> Over my almost 50 years programming, I have come to believe that if one
> wants a program to be useful, one should write the program to do as much
> work as possible and demand as little as possible from the user of the
> program. In my opinion, one should not ask the person who uses my function
> to remember to put the name of the data frame column in quotation marks.
> The function should be written so that all that needs to be passed is the
> name of the column; the function should take care of the quotation marks.
> Jihny
>
> John David Sorkin M.D., Ph.D.
> Professor of Medicine
> Chief, Biostatistics and Informatics
> University of Maryland School of Medicine Division of Gerontology and
> Geriatric Medicine
>
> > On Dec 6, 2016, at 3:17 AM, Rui Barradas <ruipbarra...@sapo.pt> wrote:
> >
> > Hello,
> >
> > Just to say that I wouldn't write the function as John did. I would get
> > rid of all the deparse/substitute stuff and instinctively use a quoted
> > argument as a column name. Something like the following.
> >
> > myfun <- function(frame, var){
> >   [...]
> >   col <- frame[, var]  # or frame[[var]]
> >   [...]
> > }
> >
> > myfun(mydf, "age")  # much better, simpler, no promises.
> >
> > Rui Barradas
> >
> > On 05-12-2016 21:49, Bert Gunter wrote:
> >> Typo: "lazy evaluation" not "lay evaluation."
> >>
> >> -- Bert
> >>
> >> Bert Gunter
> >>
> >> "The trouble with having an open mind is that people keep coming along
> >> and sticking things into it."
> >> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >>
> >>> On Mon, Dec 5, 2016 at 1:46 PM, Bert Gunter <bgunter.4...@gmail.com> wrote:
> >>> Sorry, hit "Send" by mistake.
> >>>
> >>> Inline.
> >>>
> >>>> On Mon, Dec 5, 2016 at 1:34 PM, Bert Gunter <bgunter.4...@gmail.com> wrote:
> >>>> Inline.
> >>>>
> >>>> -- Bert
> >>>>
> >>>>> On Mon, Dec 5, 2016 at 9:53 AM, Rui Barradas <ruipbarra...@sapo.pt> wrote:
> >>>>> Hello,
> >>>>>
> >>>>> Inline.
> >>>>>
> >>>>> On 05-12-2016 17:09, David Winsemius wrote:
> >>>>>>
> >>>>>>> On Dec 5, 2016, at 7:29 AM, John Sorkin <jsor...@grecc.umaryland.edu> wrote:
> >>>>>>>
> >>>>>>> Rui,
> >>>>>>> I appreciate your suggestion, but eliminating the deparse statement does
> >>>>>>> not solve my problem. Do you have any other suggestions? See code below.
> >>>>>>> Thank you,
> >>>>>>> John
> >>>>>>>
> >>>>>>> mydf <-
> >>>>>>> data.frame(id=c(1,2,3,4,5),sex=c("M","M","M","F","F"),age=c(20,34,43,32,21))
> >>>>>>> mydf
> >>>>>>> class(mydf)
> >>>>>>>
> >>>>>>> myfun <- function(frame,var){
> >>>>>>>   call <- match.call()
> >>>>>>>   print(call)
> >>>>>>>
> >>>>>>>   indx <- match(c("frame","var"),names(call),nomatch=0)
> >>>>>>>   print(indx)
> >>>>>>>   if(indx[1]==0) stop("Function called without sufficient arguments!")
> >>>>>>>
> >>>>>>>   cat("I can get the name of the dataframe as a text string!\n")
> >>>>>>>   #xx <- deparse(substitute(frame))
> >>>>>>>   print(xx)
> >>>>>>>
> >>>>>>>   cat("I can get the name of the column as a text string!\n")
> >>>>>>>   #yy <- deparse(substitute(var))
> >>>>>>>   print(yy)
> >>>>>>>
> >>>>>>>   # This does not work.
> >>>>>>>   print(frame[,var])
> >>>>>>>
> >>>>>>>   # This does not work.
> >>>>>>>   print(frame[,"var"])
> >>>>>>>
> >>>>>>>   # This does not work.
> >>>>>>>   col <- xx[,"yy"]
> >>>>>>>
> >>>>>>>   # Nor does this work.
> >>>>>>>   col <- xx[,yy]
> >>>>>>>   print(col)
> >>>>>>> }
> >>>>>>>
> >>>>>>> myfun(mydf,age)
> >>>>>>
> >>>>>> When you use that calling syntax, the system will supply the values of
> >>>>>> whatever the `age` variable contains. (And if there is no `age`-named
> >>>>>> object, you get an error at the time of the call to `myfun`.)
> >>>>>
> >>>>> Actually, no, which was very surprising to me but John's code worked (not
> >>>>> the function, the call). And with the change I've proposed, it worked
> >>>>> flawlessly. No errors. Why I don't know.
> >>>
> >>> See ?substitute and in particular the example highlighted there.
> >>>
> >>> The technical details are explained in the R Language Definition
> >>> manual. The key here is the use of promises for lay evaluations. In
> >>> fact, the expression in the call *is* available within the functions,
> >>> as is (a pointer to) the environment in which to evaluate the
> >>> expression. That is how substitute() works. Specifically, quoting from
> >>> the manual,
> >>>
> >>> *****
> >>> It is possible to access the actual (not default) expressions used as
> >>> arguments inside the function. The mechanism is implemented via
> >>> promises. When a function is being evaluated the actual expression
> >>> used as an argument is stored in the promise together with a pointer
> >>> to the environment the function was called from. When (if) the
> >>> argument is evaluated the stored expression is evaluated in the
> >>> environment that the function was called from. Since only a pointer to
> >>> the environment is used any changes made to that environment will be
> >>> in effect during this evaluation.
> >>> The resulting value is then also
> >>> stored in a separate spot in the promise. Subsequent evaluations
> >>> retrieve this stored value (a second evaluation is not carried out).
> >>> Access to the unevaluated expression is also available using
> >>> substitute.
> >>> ********
> >>>
> >>> -- Bert
> >>>
> >>>>>
> >>>>> Rui Barradas
> >>>>>
> >>>>>> You need either to call it as:
> >>>>>>
> >>>>>> myfun( mydf , "age")
> >>>>>>
> >>>>>> # Or:
> >>>>>>
> >>>>>> age <- "age"
> >>>>>> myfun( mydf, age)
> >>>>>>
> >>>>>> Unless your value of the `age`-named variable was "age" in the calling
> >>>>>> environment (and you did not give us that value in either of your postings),
> >>>>>> you would fail.
> >>>>>>
> >>>>>
> >>>>> ______________________________________________
> >>>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >>>>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >>>>> and provide commented, minimal, self-contained, reproducible code.
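As a small illustration of the promise mechanism quoted above from the R Language Definition, the following sketch may help. The function names show_promise and never_forced and the variable a are hypothetical and do not appear in the thread; the sketch only uses substitute() and deparse() as documented in base R.

```r
# Hypothetical helper, not from the thread: the argument expression is
# available through substitute(), and the promise is evaluated on first use.
show_promise <- function(x) {
  expr  <- substitute(x)     # unevaluated expression from the call
  label <- deparse(expr)[1]  # its text form
  value <- x                 # forces the promise in the caller's environment
  list(label = label, value = value)
}

a <- 2
res <- show_promise(a + 1)
res$label  # "a + 1"
res$value  # 3

# An argument that is never used is never evaluated, so this call does not
# fail even though `no_such_object` does not exist.
never_forced <- function(x) deparse(substitute(x))[1]
never_forced(no_such_object)  # "no_such_object"
```

The second function is consistent with Rui's observation that the call myfun(mydf, age) itself raised no error: as long as the function only inspects the expression and never evaluates var, the missing `age` object is never looked up.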