Michael's idea has an interesting bonus that he and I discussed earlier.
It would be very convenient to have a container of key/value pairs. I
imagine many people often write this:

    x <- mapply(names(x), x, FUN = function(k, v) {
        # work with key and value
    })

especially ex-Perl people accustomed to

    while ( ($key, $value) = each(%some_hash) ) { }

Perhaps there is room for additional discussion of using lists of SYMSXPs
in this manner. (If SYMSXPs are not that safe, perhaps a looping construct
for named vectors that gave the illusion of iterating over a list of
two-tuples.)

Pete

____________________
Peter M. Haverty, Ph.D.
Genentech, Inc.
phave...@gene.com

On Thu, Jan 8, 2015 at 11:57 AM, <luke-tier...@uiowa.edu> wrote:

> On Thu, 8 Jan 2015, Michael Lawrence wrote:
>
>> If we do add an argument to get(), then it should be named consistently
>> with the ifnotfound argument of mget(). As mentioned, the possibility of
>> a NULL value is problematic. One solution is a sentinel value that
>> indicates an unbound value (like R_UnboundValue).
>
> A null default is fine -- it's a default; if it isn't right for a
> particular case you can provide something else.
>
>> But another idea (and one pretty similar to John's) is to follow the
>> SYMSXP design at the C level, where there is a structure that points to
>> the name and a value. We already have SYMSXPs at the R level of course
>> (name objects) but they do not provide access to the value, which is
>> typically R_UnboundValue. But this does not even need to be implemented
>> with SYMSXP. The design would allow something like:
>>
>>     binding <- getBinding("x", env)
>>     if (hasValue(binding)) {
>>         x <- value(binding) # throws an error if none
>>         message(name(binding), " has value ", x)
>>     }
>>
>> That, I think, is a bit verbose but readable and could be made fast.
>> And I think binding objects would be useful in other ways, as they are
>> essentially a "named object". For example, when iterating over an
>> environment.
>
> This would need a lot more thought. Directly exposing the internals is
> definitely not something we want to do as we may well want to change
> that design.
> But there are lots of other corner issues that would have
> to be thought through before going forward, such as what happens if an
> rm occurs between obtaining a binding object and doing something with
> it. Serialization would also need thinking through. This doesn't seem
> like a worthwhile place to spend our efforts to me.
>
> Adding getIfExists, or .get, or get0, or whatever seems fine. Adding
> an argument to get() with missing giving current behavior may be OK
> too. Rewriting exists and get as .Primitives may be sufficient though.
>
> Best,
>
> luke
>
>> Michael
>>
>> On Thu, Jan 8, 2015 at 6:03 AM, John Nolan <jpno...@american.edu> wrote:
>>
>>> Adding an optional argument to get (and mget) like
>>>
>>>     val <- get(name, where, ..., value.if.not.found=NULL )   (*)
>>>
>>> would be useful for many. HOWEVER, it is possible that there could be
>>> some confusion here: (*) can give a NULL because either x exists and
>>> has value NULL, or because x doesn't exist. If that matters, the user
>>> would need to be careful about specifying a value.if.not.found that
>>> cannot be confused with a valid value of x.
>>>
>>> To avoid this difficulty, perhaps we want both: have Martin's
>>> getifexists( ) return a list with two values:
>>>   - a boolean variable 'found'  # = value returned by exists( )
>>>   - a variable 'value'
>>>
>>> Then implement get( ) as:
>>>
>>>     get <- function(x,...,value.if.not.found ) {
>>>
>>>       if( missing(value.if.not.found) ) {
>>>         a <- getifexists(x,... )
>>>         if (!a$found) stop("x not found")
>>>       } else {
>>>         a <- getifexists(x,...,value.if.not.found )
>>>       }
>>>       return(a$value)
>>>     }
>>>
>>> Note that value.if.not.found has no default value in above.
>>> It behaves exactly like current get does if value.if.not.found
>>> is not specified, and if it is specified, it would be faster
>>> in the common situation mentioned below:
>>>     if(exists(x,...)) { get(x,...) }
>>>
>>> John
>>>
>>> P.S.
>>> if you like dromedaries call it valueIfNotFound ...
>>>
>>> ..............................................................
>>> John P. Nolan
>>> Math/Stat Department
>>> 227 Gray Hall, American University
>>> 4400 Massachusetts Avenue, NW
>>> Washington, DC 20016-8050
>>>
>>> jpno...@american.edu    voice: 202.885.3140
>>> web: academic2.american.edu/~jpnolan
>>> ..............................................................
>>>
>>> -----"R-devel" <r-devel-boun...@r-project.org> wrote: -----
>>> To: Martin Maechler <maech...@stat.math.ethz.ch>, R-devel@r-project.org
>>> From: Duncan Murdoch
>>> Sent by: "R-devel"
>>> Date: 01/08/2015 06:39AM
>>> Subject: Re: [Rd] RFC: getifexists() {was [Bug 16065] "exists" ...}
>>>
>>> On 08/01/2015 4:16 AM, Martin Maechler wrote:
>>> > In November, we had a "bug repository conversation"
>>> > with Peter Haverty and myself:
>>> >
>>> >   https://bugs.r-project.org/bugzilla/show_bug.cgi?id=16065
>>> >
>>> > where the bug report title started with
>>> >
>>> >  --->> "exists" is a bottleneck for dispatch and package loading, ...
>>> >
>>> > Peter proposed an extra simplified and hence faster version of exists(),
>>> > and I commented
>>> >
>>> >   > --- Comment #2 from Martin Maechler <maech...@stat.math.ethz.ch> ---
>>> >   > I'm very grateful that you've started exploring the bottlenecks of
>>> >   > loading packages with many S4 classes (and methods)...
>>> >   > and I hope we can make real progress there rather sooner than later.
>>> >
>>> >   > OTOH, your `summaryRprof()` in your vignette indicates that exists()
>>> >   > may use up to 10% of the time spent in library(reportingTools), and
>>> >   > your speedup proposals of exists() may go up to ca 30% which is good
>>> >   > and well worth considering, but still we can only expect 2-3% speedup
>>> >   > for package loading which unfortunately is not much.
>>> >
>>> >   > Still I agree it is worth looking at exists() as you did ...
>>> >   > and consider providing a fast simplified version of it in addition
>>> >   > to current exists() [I think].
>>> >
>>> >   > BTW, as we talk about enhancements here, maybe consider a further
>>> >   > possibility: My subjective guess is that probably more than half of
>>> >   > exists() uses are of the form
>>> >
>>> >   >     if(exists(name, where, .......)) {
>>> >   >        get(name, where, ....)
>>> >   >        ..
>>> >   >     } else {
>>> >   >        NULL / error() / .. or similar
>>> >   >     }
>>> >
>>> >   > i.e. many exists() calls when returning TRUE are immediately followed
>>> >   > by the corresponding get() call which repeats quite a bit of the
>>> >   > lookup that exists() has done.
>>> >
>>> >   > Instead, I'd imagine a function, say getifexists(name, ...) that does
>>> >   > both at once in the "exists is TRUE" case but in a way we can easily
>>> >   > keep the if(.) .. else clause above. One already existing approach
>>> >   > would use
>>> >
>>> >   >     if(!inherits(tryCatch(xx <- get(name, where, ...),
>>> >   >                           error=function(e)e), "error")) {
>>> >
>>> >   >       ... (( work with xx )) ...
>>> >
>>> >   >     } else {
>>> >   >        NULL / error() / .. or similar
>>> >   >     }
>>> >
>>> >   > but of course our C implementation would be more efficient and use
>>> >   > more concise syntax {which should not look like error handling}.
>>> >   > Follow ups to this idea should really go to R-devel (the mailing
>>> >   > list).
>>> >
>>> > and now I do follow up here myself :
>>> >
>>> > I found that 'getifexists()' is actually very simple to implement,
>>> > I have already tested it a bit, but not yet committed to R-devel
>>> > (the "R trunk" aka "master branch") because I'd like to get
>>> > public comments {RFC := Request For Comments}.
>>> >
>>>
>>> I don't like the name -- I'd prefer getIfExists. As Baath (2012, R
>>> Journal) pointed out, R names are very inconsistent in naming
>>> conventions, but lowerCamelCase is the most common choice.
>>> Second most common is period.separated, so an argument could be made
>>> for get.if.exists, but there's still the possibility of confusion with
>>> S3 methods, and users of other languages where "." is an operator find
>>> it a little strange.
>>>
>>> If you don't like lowerCamelCase (and a lot of people don't), then I
>>> think underscore_separated is the next best choice, so would use
>>> get_if_exists.
>>>
>>> Another possibility is to make no new name at all, and just add an
>>> optional parameter to get() (which if present acts as your value.if.not
>>> parameter, if not present keeps the current "object not found" error).
>>>
>>> Duncan Murdoch
>>>
>>> > My version of the help file {for both exists() and getifexists()}
>>> > rendered in text is
>>> >
>>> > ---------------------- help(getifexists) -------------------------------
>>> > Is an Object Defined?
>>> >
>>> > Description:
>>> >
>>> >      Look for an R object of the given name and possibly return it
>>> >
>>> > Usage:
>>> >
>>> >      exists(x, where = -1, envir = , frame, mode = "any",
>>> >             inherits = TRUE)
>>> >
>>> >      getifexists(x, where = -1, envir = as.environment(where),
>>> >                  mode = "any", inherits = TRUE, value.if.not = NULL)
>>> >
>>> > Arguments:
>>> >
>>> >        x: a variable name (given as a character string).
>>> >
>>> >    where: where to look for the object (see the details section); if
>>> >           omitted, the function will search as if the name of the
>>> >           object appeared unquoted in an expression.
>>> >
>>> >    envir: an alternative way to specify an environment to look in, but
>>> >           it is usually simpler to just use the 'where' argument.
>>> >
>>> >    frame: a frame in the calling list.  Equivalent to giving 'where' as
>>> >           'sys.frame(frame)'.
>>> >
>>> >     mode: the mode or type of object sought: see the 'Details' section.
>>> >
>>> > inherits: should the enclosing frames of the environment be searched?
>>> >
>>> > value.if.not: the return value of 'getifexists(x, *)' when 'x' does not
>>> >           exist.
>>> >
>>> > Details:
>>> >
>>> >      The 'where' argument can specify the environment in which to look
>>> >      for the object in any of several ways: as an integer (the position
>>> >      in the 'search' list); as the character string name of an element
>>> >      in the search list; or as an 'environment' (including using
>>> >      'sys.frame' to access the currently active function calls).  The
>>> >      'envir' argument is an alternative way to specify an environment,
>>> >      but is primarily there for back compatibility.
>>> >
>>> >      This function looks to see if the name 'x' has a value bound to it
>>> >      in the specified environment.  If 'inherits' is 'TRUE' and a value
>>> >      is not found for 'x' in the specified environment, the enclosing
>>> >      frames of the environment are searched until the name 'x' is
>>> >      encountered.  See 'environment' and the 'R Language Definition'
>>> >      manual for details about the structure of environments and their
>>> >      enclosures.
>>> >
>>> >      *Warning:* 'inherits = TRUE' is the default behaviour for R but
>>> >      not for S.
>>> >
>>> >      If 'mode' is specified then only objects of that type are sought.
>>> >      The 'mode' may specify one of the collections '"numeric"' and
>>> >      '"function"' (see 'mode'): any member of the collection will
>>> >      suffice.  (This is true even if a member of a collection is
>>> >      specified, so for example 'mode = "special"' will seek any type of
>>> >      function.)
>>> >
>>> > Value:
>>> >
>>> >      'exists():' Logical, true if and only if an object of the correct
>>> >      name and mode is found.
>>> >
>>> >      'getifexists():' The object (as from 'get(x, *)') if 'exists(x, *)'
>>> >      is true, otherwise 'value.if.not'.
>>> >
>>> > Note:
>>> >
>>> >      With 'getifexists()', instead of the easy to read but somewhat
>>> >      inefficient
>>> >
>>> >        if (exists(myVarName, envir = myEnvir)) {
>>> >          r <- get(myVarName, envir = myEnvir)
>>> >          ## ... deal with r ...
>>> >        }
>>> >
>>> >      you now can use the more efficient (and slightly harder to read)
>>> >
>>> >        if (!is.null(r <- getifexists(myVarName, envir = myEnvir))) {
>>> >          ## ... deal with r ...
>>> >        }
>>> >
>>> > References:
>>> >
>>> >      Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) _The New S
>>> >      Language_.  Wadsworth & Brooks/Cole.
>>> >
>>> > See Also:
>>> >
>>> >      'get'.  For quite a different kind of "existence" checking, namely
>>> >      if function arguments were specified, 'missing'; and for yet a
>>> >      different kind, namely if a file exists, 'file.exists'.
>>> >
>>> > Examples:
>>> >
>>> >      ## Define a substitute function if necessary:
>>> >      if(!exists("some.fun", mode = "function"))
>>> >        some.fun <- function(x) { cat("some.fun(x)\n"); x }
>>> >      search()
>>> >      exists("ls", 2) # true even though ls is in pos = 3
>>> >      exists("ls", 2, inherits = FALSE) # false
>>> >
>>> >      ## These are true (in most circumstances):
>>> >      identical(ls, getifexists("ls"))
>>> >      identical(NULL, getifexists(".foo.bar.")) # default value.if.not = NULL(!)
>>> >
>>> > ----------------- end[ help(getifexists) ] -----------------------------
>>> >
>>> > ______________________________________________
>>> > R-devel@r-project.org mailing list
>>> > https://stat.ethz.ch/mailman/listinfo/r-devel
>
> --
> Luke Tierney
> Ralph E. Wareham Professor of Mathematical Sciences
> University of Iowa                  Phone:  319-335-3386
> Department of Statistics and        Fax:    319-335-3017
>    Actuarial Science
> 241 Schaeffer Hall                  email:  luke-tier...@uiowa.edu
> Iowa City, IA 52242                 WWW:    http://www.stat.uiowa.edu
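
To make two of the ideas in this thread concrete, here is a minimal sketch in plain R. It is only an illustration under stated assumptions, not anyone's proposed implementation: get_if_exists() and key_value_pairs() are hypothetical names that exist neither in base R nor in the patch under discussion. The first follows John's found/value suggestion, using mget() with a private sentinel so that a stored NULL is not confused with "not found". The second gives the "list of two-tuples" view of a named vector mentioned at the top of this message.

```r
# Hypothetical helper (not in base R): one lookup that reports both whether
# 'name' is bound and what its value is, so a stored NULL is not mistaken
# for "not found".
get_if_exists <- function(name, envir = parent.frame(), inherits = TRUE) {
    sentinel <- new.env()  # fresh environment; identical() compares it by reference
    val <- mget(name, envir = envir, inherits = inherits,
                ifnotfound = list(sentinel))[[1L]]
    found <- !identical(val, sentinel)
    list(found = found, value = if (found) val else NULL)
}

# Hypothetical helper (not in base R): present a named vector or list as a
# plain list of key/value pairs so a loop can work on one pair at a time.
# Assumes 'x' has names.
key_value_pairs <- function(x) {
    stopifnot(!is.null(names(x)))
    unname(Map(function(k, v) list(key = k, value = v), names(x), x))
}

e <- new.env()
assign("a", NULL, envir = e)   # bound, but the value is NULL
r <- get_if_exists("a", e, inherits = FALSE)
if (r$found) message("a is bound; is.null(value) = ", is.null(r$value))

for (p in key_value_pairs(c(alpha = 1, beta = 2))) {
    message(p$key, " has value ", p$value)
}
```

The sentinel is created inside the function, so no caller can supply an object that is identical() to it. That is the same role R_UnboundValue plays at the C level, without exposing any internals.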