2011/4/1 Nick Sabbe <nick.sa...@ugent.be>:
> This should be a version that does what you want.

Indeed it does, thank you very much!

> Because you named the variable lvarname, I assumed you were already passing
> "lvar" instead of trying to pass lvar (without the quotes), which is in no
> way a 'name'.

Sorry about that, I can see how my variable names were somewhat confusing.

Many thanks once again!

>
> -----Original Message-----
> From: irene.p...@googlemail.com [mailto:irene.p...@googlemail.com] On Behalf
> Of E Hofstadler
> Sent: Friday, 1 April 2011 14:28
> To: Nick Sabbe
> Cc: r-help@r-project.org
> Subject: Re: [R] programming: telling a function where to look for the
> entered variables
>
> Thanks Nick and Juan for your replies.
>
> Nick, thanks for pointing out the warning in subset(). I'm not sure
> though I understand the example you provided -- because despite using
> subset() rather than bracket notation, the original function (myfunct)
> does what is expected of it. The problem I have is with the second
> function (myfunct.better), where variable names + dataframe are not
> fixed within the function but passed to the function when calling it
> -- and even with bracket notation I don't quite manage to tell R where
> to look for the columns that relate to the entered column names.
> (but then perhaps I misunderstood you)
>
> This is what I tried (using bracket notation):
>
> myfunct.better <- function(dataframe, subgroup, lvarname, yvarname){
> Data.tmp <- dataframe[dataframe[,deparse(substitute(lvarname))]==subgroup,
> c("xvar",deparse(substitute(yvarname)))]
> }
>
> but this creates an empty contingency table only -- perhaps because my
> use of deparse() is flawed (I think what is converted into a string is
> "lvarname" and "yvarname", rather than the column names that these two
> function-variables represent in the dataframe)?
>
>
> 2011/4/1 Nick Sabbe <nick.sa...@ugent.be>:
>> See the warning in ?subset.
>> Passing the column name of lvar is not the same as passing the 'contextual
>> column' (as I coin it in these circumstances).
>> You can solve it by indeed using [] instead.
>>
>> For my own comfort, here is the relevant line from your original function:
>> Data.tmp <- subset(Fulldf, lvar==subgroup, select=c("xvar","yvar"))
>> Which should become something like (untested but should be close):
>> Data.tmp <- Fulldf[Fulldf[,"lvar"]==subgroup, c("xvar","yvar")]
>>
>> This should be a lot easier to translate based on column names, as the
>> column names are now used as such.
>>
>> HTH,
>>
>>
>> Nick Sabbe
>> --
>> ping: nick.sa...@ugent.be
>> link: http://biomath.ugent.be
>> wink: A1.056, Coupure Links 653, 9000 Gent
>> ring: 09/264.59.36
>>
>> -- Do Not Disapprove
>>
>>
>> -----Original Message-----
>> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
>> Behalf Of E Hofstadler
>> Sent: Friday, 1 April 2011 13:09
>> To: r-help@r-project.org
>> Subject: [R] programming: telling a function where to look for the entered
>> variables
>>
>> Hi there,
>>
>> Could someone help me with the following programming problem..?
>>
>> I have written a function that works for my intended purpose, but it
>> is quite closely tied to a particular dataframe and the names of the
>> variables in this dataframe. However, I'd like to use the same
>> function for different dataframes and variables. My problem is that
>> I'm not quite sure how to tell my function in which dataframe the
>> entered variables are located.
>>
>> Here's some reproducible data and the function:
>>
>> # create reproducible data
>> set.seed(124)
>> xvar <- sample(0:3, 1000, replace = T)
>> yvar <- sample(0:1, 1000, replace=T)
>> zvar <- rnorm(100)
>> lvar <- sample(0:1, 1000, replace=T)
>> Fulldf <- as.data.frame(cbind(xvar,yvar,zvar,lvar))
>> Fulldf$xvar <- factor(xvar, labels=c("blue","green","red","yellow"))
>> Fulldf$yvar <- factor(yvar, labels=c("area1","area2"))
>> Fulldf$lvar <- factor(lvar, labels=c("yes","no"))
>>
>> and here's the function in the form in which it currently works: from a
>> subset of the dataframe Fulldf, a contingency table is created (in my
>> actual data, several other operations are then performed on that
>> contingency table, but these are not relevant for the problem in
>> question, so I've removed them).
>>
>> # function as it currently works: tailored to a particular dataframe
>> # (Fulldf)
>>
>> myfunct <- function(subgroup){
>> # enter a particular subgroup for which the contingency table should be
>> # calculated (i.e. a particular value of the factor lvar)
>> # restrict dataframe to given subgroup and two columns of the original
>> # dataframe
>> Data.tmp <- subset(Fulldf, lvar==subgroup, select=c("xvar","yvar"))
>> Data.tmp <- na.omit(Data.tmp) # exclude missing values
>> indextable <- table(Data.tmp$xvar, Data.tmp$yvar) # make contingency table
>> return(indextable)
>> }
>>
>> Since I need to use the function with different dataframes and
>> variable names, I'd like to be able to tell my function the name of
>> the dataframe and variables it should use for calculating the index.
>> This is how I tried to modify the first part of the function, but it
>> didn't work:
>>
>> # function as I would like it to work: independent of any particular
>> # dataframe or variable names (doesn't work)
>>
>> myfunct.better <- function(subgroup, lvarname, yvarname, dataframe){
>> # enter the subgroup, the variable names to be used and the dataframe
>> # in which they are found
>> # trying to subset the given dataframe for the given subgroup of the
>> # given variable. The variable "xvar" happens to have the same name in
>> # all dataframes, but the variable yvarname has different names in the
>> # different dataframes
>> Data.tmp <- subset(dataframe, lvarname==subgroup, select=c("xvar",
>> deparse(substitute(yvarname))))
>> Data.tmp <- na.omit(Data.tmp)
>> # create the contingency table on the basis of the entered variables
>> indextable <- table(Data.tmp$xvar, Data.tmp$yvarname)
>> return(indextable)
>> }
>>
>> calling
>>
>> myfunct.better("yes", lvarname=lvar, yvarname=yvar, dataframe=Fulldf)
>>
>> results in the following error:
>>
>> Error in `[.data.frame`(x, r, vars, drop = drop) :
>> undefined columns selected
>>
>> My feeling is that R doesn't know where to look for the entered
>> variables (lvar, yvar), but I'm not sure how to solve this problem. I
>> tried using with() and even attach() within the function, but that
>> didn't work.
>>
>> Any help is greatly appreciated.
>>
>> Best,
>> Esther
>>
>> P.S.:
>> Are there books that elaborate on programming in R for beginners -- and I
>> mean things like how to best use vectorization instead of loops and
>> general "best practice" tips for programming. Most of the books I've
>> been looking at focus on applying R for particular statistical
>> analyses, and only comparatively briefly deal with more general
>> programming aspects. I was wondering if there are any books or tutorials
>> out there that cover the latter aspects in a more elaborate and
>> systematic way...?
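
For reference, a minimal sketch of the approach discussed in this thread: pass the column names as character strings and index with brackets, as Nick suggested. The function name make_index_table, its argument order and the xvarname default are assumptions made for illustration; this is not the version Nick sent, which is not quoted here. The example data only follows the spirit of the original post.

```r
# Hypothetical helper: name and argument order are assumptions, not from the thread.
# lvarname, yvarname and xvarname are column names given as character strings.
make_index_table <- function(dataframe, subgroup, lvarname, yvarname,
                             xvarname = "xvar") {
  # Rows where the grouping column equals the requested subgroup;
  # which() drops NA comparisons instead of returning all-NA rows.
  rows <- which(dataframe[[lvarname]] == subgroup)
  # Strings index columns directly, so no deparse(substitute()) is needed.
  data_tmp <- dataframe[rows, c(xvarname, yvarname), drop = FALSE]
  data_tmp <- na.omit(data_tmp)  # exclude missing values
  table(data_tmp[[xvarname]], data_tmp[[yvarname]])
}

# Example data in the spirit of the original post (not identical to it)
set.seed(124)
n <- 1000
Fulldf <- data.frame(
  xvar = factor(sample(0:3, n, replace = TRUE), levels = 0:3,
                labels = c("blue", "green", "red", "yellow")),
  yvar = factor(sample(0:1, n, replace = TRUE), levels = 0:1,
                labels = c("area1", "area2")),
  lvar = factor(sample(0:1, n, replace = TRUE), levels = 0:1,
                labels = c("yes", "no"))
)

tab <- make_index_table(Fulldf, "yes", "lvar", "yvar")
print(tab)
```

Note that the column names are quoted in the call. Inside the function, data_tmp[[yvarname]] looks up whichever column the string names, whereas Data.tmp$yvarname in the original attempt looks for a column literally called "yvarname".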

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.