Re: [R] problem with lapply(x, subset, ...) and variable select argument
Just one simple shortening of DR's solution:

    tt <- function (n) {
      x <- list(data.frame(a=1,b=2), data.frame(a=3,b=4))
      print(sapply(x, function(...) subset(...), select = n))
    }

    n <- "b"
    tt("a")

On 10/11/05, Dimitris Rizopoulos <[EMAIL PROTECTED]> wrote:
> As Gabor said, the issue here is that subset.data.frame() evaluates
> the value of the `select' argument in the parent.frame(). Thus, if you
> create a local function within lapply() (or sapply()), it works:
>
> tt <- function (n) {
>   x <- list(data.frame(a = 1, b = 2), data.frame(a = 3, b = 4))
>   print(lapply(x, function(y, n) subset(y, select = n), n = n))
>   print(sapply(x, function(y, n) subset(y, select = n), n = n))
> }
>
> tt("a")
>
> I hope it helps.
>
> Best,
> Dimitris
Re: [R] problem with lapply(x, subset, ...) and variable select argument
On Tue, 11 Oct 2005, joerg van den hoff wrote:

> many thanks to thomas and gabor for their help. both solutions solve my
> problem perfectly.
>
> but just as an attempt to improve my understanding of the inner workings of R
> (similar problems are sure to come up ...) two more questions:
>
> 1.
> why does the call of the "[" function (thomas' solution) behave differently
> from "subset" in that the lookup of the variable "n" works without providing
> lapply with the current environment (which is nice)?

"[" behaves like nearly all functions in R: the value of the argument is passed. subset() does some tricky things to subvert the usual argument passing. Quite a few of the modelling functions do similar tricky things, and they do sometimes get confused when passed as arguments to another function.

> 2.
> using 'subset' in this context becomes more cumbersome if sapply is used. it
> seems that then I need
> ...
> environment(sapply) <- environment(lapply) <- environment()
> sapply(x, subset, select = n)
> ...
> to get it working (and that means you must know that sapply uses lapply). or
> can I somehow avoid the additional explicit definition of the
> lapply-environment?

You really don't want to go around playing with environment() on functions. That way lies madness.

Use subset at the command line and [ or [[ in programming.

I don't think I have ever set environment() on a function (only on formulas).

    -thomas

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
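To make the "tricky things" concrete, here is a small sketch of how subset()'s handling of select can surprise you inside a function. The helper names pick_subset and pick_bracket and the toy data frames are made up for illustration. The sketch relies on subset.data.frame() looking up the select expression among the data frame's column names before it looks anywhere else.

```r
# Hypothetical helpers, not from the thread.
pick_subset  <- function(d, n) subset(d, select = n)  # nonstandard evaluation of `n`
pick_bracket <- function(d, n) d[, n, drop = FALSE]   # ordinary argument passing

plain <- data.frame(a = 1:2, b = 3:4)
clash <- data.frame(a = 1:2, n = 3:4)  # has a column that is itself called "n"

names(pick_subset(plain, "a"))   # the function argument n is used, as intended
names(pick_subset(clash, "a"))   # the symbol n is matched to the column named n first
names(pick_bracket(clash, "a"))  # "[" only ever sees the value "a"
```

With the clashing data frame, the same call to pick_subset() silently returns a different column, which is the kind of confusion that "[" avoids.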
Re: [R] problem with lapply(x, subset, ...) and variable select argument
"Dimitris Rizopoulos" <[EMAIL PROTECTED]> writes:

> As Gabor said, the issue here is that subset.data.frame() evaluates
> the value of the `select' argument in the parent.frame(). Thus, if you
> create a local function within lapply() (or sapply()), it works:

It's more complicated than that: It evaluates the select argument in a named list with names duplicating those of the data frame, and *then* in parent.frame. This is convenient for command line use, because you can specify ranges of variables as in

    dfsub <- subset(dfr,select=c(sex:treat, x_pre:x_24))

but it is quite risky to try and do this inside a function - if you're passing in a variable, the result depends on whether there is a variable of the same name in the data frame!

You can probably get around it using substitute() constructions, but I think it is safer to avoid using functions with nonstandard semantics inside functions.

> tt <- function (n) {
>   x <- list(data.frame(a = 1, b = 2), data.frame(a = 3, b = 4))
>   print(lapply(x, function(y, n) subset(y, select = n), n = n))
>   print(sapply(x, function(y, n) subset(y, select = n), n = n))
> }
>
> tt("a")
>
> I hope it helps.
>
> Best,
> Dimitris
Re: [R] problem with lapply(x, subset, ...) and variable select argument
As Gabor said, the issue here is that subset.data.frame() evaluates the value of the `select' argument in the parent.frame(). Thus, if you create a local function within lapply() (or sapply()), it works:

    tt <- function (n) {
      x <- list(data.frame(a = 1, b = 2), data.frame(a = 3, b = 4))
      print(lapply(x, function(y, n) subset(y, select = n), n = n))
      print(sapply(x, function(y, n) subset(y, select = n), n = n))
    }

    tt("a")

I hope it helps.

Best,
Dimitris


Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://www.med.kuleuven.be/biostat/
     http://www.student.kuleuven.be/~m0390867/dimitris.htm
Re: [R] problem with lapply(x, subset, ...) and variable select argument
many thanks to thomas and gabor for their help. both solutions solve my problem perfectly.

but just as an attempt to improve my understanding of the inner workings of R (similar problems are sure to come up ...) two more questions:

1. why does the call of the "[" function (thomas' solution) behave differently from "subset" in that the lookup of the variable "n" works without providing lapply with the current environment (which is nice)?

2. using 'subset' in this context becomes more cumbersome if sapply is used. it seems that then I need

    ...
    environment(sapply) <- environment(lapply) <- environment()
    sapply(x, subset, select = n)
    ...

to get it working (and that means you must know that sapply uses lapply). or can I somehow avoid the additional explicit definition of the lapply-environment?

again: many thanks

joerg
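One way to avoid touching environment() at all is to put the value of n into the call before subset() ever sees it. The sketch below shows two standard tools for that, do.call() and bquote(). The function names select_forced and select_quoted and the toy data are invented for illustration, and this is only one possible approach, not something proposed in the thread.

```r
frames <- list(data.frame(a = 1, b = 2), data.frame(a = 3, b = 4))

# do.call() evaluates the argument list first, so subset() receives the
# string itself rather than the symbol `n`.
select_forced <- function(frames, n) {
  lapply(frames, function(d) do.call(subset, list(d, select = n)))
}

# bquote() builds the call subset(d, select = "<value of n>") and eval() runs it.
select_quoted <- function(frames, n) {
  lapply(frames, function(d) eval(bquote(subset(d, select = .(n)))))
}

names(select_forced(frames, "a")[[1]])
names(select_quoted(frames, "b")[[2]])
```

Because subset() only ever sees a constant string in both variants, the result no longer depends on which frame the lookup of n would have started from.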
Re: [R] problem with lapply(x, subset, ...) and variable select argument
The problem is that subset looks into its parent frame, but in this case the parent frame is not the environment in tt but the environment in lapply, since tt does not call subset directly; lapply does.

Try this, which is similar except that we have added the line beginning with environment before the print statement.

    tt <- function (n) {
      x <- list(data.frame(a=1,b=2), data.frame(a=3,b=4))
      environment(lapply) <- environment()
      print(lapply(x, subset, select = n))
    }

    n <- "b"
    tt("a")

What this does is create a new, local version of lapply whose enclosing environment is the environment in tt, so a lookup of n that starts in lapply's frame continues into tt.
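The lookup path described above can be checked without modifying lapply. The following is a minimal sketch; the names direct, wrapped, frames and col_name are hypothetical, and it assumes that no object called col_name is visible from the global environment or the search path.

```r
frames <- list(data.frame(a = 1, b = 2), data.frame(a = 3, b = 4))

# subset is handed straight to lapply: the symbol col_name is looked up
# starting from lapply's frame, which does not lead back to direct().
direct <- function(col_name) {
  tryCatch(lapply(frames, subset, select = col_name),
           error = function(e) e)
}

# subset is called from a function created inside wrapped(): the lookup
# starts in that function's frame and continues into wrapped().
wrapped <- function(col_name) {
  lapply(frames, function(d) subset(d, select = col_name))
}

failed <- direct("a")
inherits(failed, "error")  # the direct call cannot find col_name
names(wrapped("a")[[1]])   # the wrapped call selects the requested column
```

The wrapper is the same idea as creating a local function inside lapply(), only written as a closure instead of passing n through as an extra argument.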
Re: [R] problem with lapply(x, subset, ...) and variable select argument
On Mon, 10 Oct 2005, joerg van den hoff wrote:

> I need to extract identically named columns from several data frames in
> a list. the column name is a variable (i.e. not known in advance). the
> whole thing occurs within a function body. I'd like to use lapply with a
> variable 'select' argument.

You would probably be better off using "[" rather than subset().

    tt <- function (n) {
      x <- list(data.frame(a=1,b=2), data.frame(a=3,b=4))
      print(lapply(x,"[",n))
    }

seems to do what you want.

    -thomas

Thomas Lumley                   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]               University of Washington, Seattle
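As a slightly expanded sketch of the "[" suggestion, the two hypothetical helpers below return either one-column data frames (via "[") or the bare values (via "[["). The helper names are made up, and the vapply() variant assumes each selected column holds exactly one number, as in the toy data used in this thread.

```r
frames <- list(data.frame(a = 1, b = 2), data.frame(a = 3, b = 4))

# "[" keeps the data frame structure: a list of one-column data frames.
columns_as_frames <- function(frames, n) lapply(frames, "[", n)

# "[[" extracts the column itself; vapply() checks that each result is a
# single number (an assumption that holds for this toy data only).
columns_as_values <- function(frames, n) {
  vapply(frames, function(d) d[[n]], numeric(1))
}

n <- "b"  # a global n is irrelevant here: only the value passed in is used
as_frames <- columns_as_frames(frames, "a")
as_values <- columns_as_values(frames, "b")
```

Both helpers receive the column name as an ordinary value, so no evaluation frame needs to be specified.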
[R] problem with lapply(x, subset, ...) and variable select argument
I need to extract identically named columns from several data frames in a list. the column name is a variable (i.e. not known in advance). the whole thing occurs within a function body. I'd like to use lapply with a variable 'select' argument.

example:

    tt <- function (n) {
      x <- list(data.frame(a=1,b=2), data.frame(a=3,b=4))
      for (xx in x) print(subset(xx, select = n))  ### works
      print (lapply(x, subset, select = a))        ### works
      print (lapply(x, subset, select = "a"))      ### works
      print (lapply(x, subset, select = n))        ### does not work as intended
    }
    n = "b"
    tt("a")  #works (but selects not the intended column)
    rm(n)
    tt("a")  #no longer works in the lapply call including variable 'n'

question: how can I enforce evaluation of the variable n such that the lapply call works? I suspect it has something to do with eval and specifying the correct evaluation frame, but how?

many thanks

joerg