The other thing you need to be aware of if you're using the other approach is partial matching:

df <- data.frame(xyz = 1)
is.null(df$x)
#> [1] FALSE

Duncan - I think that argues for including a has_name() (hasName() ?)
function in base R. Is that something you'd consider?

Hadley

On Mon, Jun 27, 2016 at 10:05 AM, Lenth, Russell V
<russell-le...@uiowa.edu> wrote:
> Thanks, Hadley. I do understand why you'd want more careful checking.
>
> If you're going to provide a variable-existing function, may I suggest a
> short name like 'has'? I.e., has(x, var) returns TRUE if x has var in it.
>
> Thanks
>
> Russ
>
>> On Jun 27, 2016, at 9:47 AM, Hadley Wickham <h.wick...@gmail.com> wrote:
>>
>> On Mon, Jun 27, 2016 at 9:03 AM, Duncan Murdoch
>> <murdoch.dun...@gmail.com> wrote:
>>> On 27/06/2016 9:22 AM, Lenth, Russell V wrote:
>>>>
>>>> My package 'lsmeans' is now suddenly broken because of a new provision in
>>>> the 'tibble' package (loaded by 'dplyr' 0.5.0), whereby the "[[" and "$"
>>>> methods for 'tbl_df' objects - as documented - throw an error if a variable
>>>> is not found.
>>>>
>>>> The problem is that my code uses tests like this:
>>>>
>>>> if (is.null (x$var)) {...}
>>>>
>>>> to see whether 'x' has a variable 'var'. Obviously, I can work around this
>>>> using
>>>>
>>>> if (!("var" %in% names(x))) {...}
>>>>
>>>> but (a) I like the first version better, in terms of the code being
>>>> understandable; and (b) isn't there a long history whereby we can expect a
>>>> NULL result when accessing an absent member of a list (and hence a
>>>> data.frame)? (c) the code base for 'lsmeans' has about 50 instances of such
>>>> tests.
>>>>
>>>> Anyway, I wonder if a lot of other package developers test for absent
>>>> variables in that first way; if so, they too are in for a rude awakening if
>>>> their users provide a tbl_df instead of a data.frame. And what is considered
>>>> the best practice for testing absence of a list member? Apparently, not
>>>> either of the above; and because of (c), I want to do these many tedious
>>>> corrections only once.
>>>>
>>>> Thanks for any light you can shed.
>>>
>>>
>>> This is why CRAN asks that people test reverse dependencies.
>>
>> Which we did do - the problem is that this is actually caused by a
>> recursive reverse dependency (lsmeans -> dplyr -> tibble), and we
>> didn't correctly anticipate how much pain this would cause.
>>
>>> I think the most defensive thing you can do is to write a small function
>>>
>>> name_missing <- function(x, name)
>>>     !(name %in% names(x))
>>>
>>> and use name_missing(x, "var") in your tests. (Pick your own name to make
>>> your code understandable if you don't like my choice.)
>>>
>>> You could suggest to the tibble maintainers that they add a function like
>>> this.
>>
>> We're definitely going to add this.
>>
>> And I think we'll make df[["var"]] return NULL too, so at least
>> there's one easy way to opt out.
>>
>> The motivation for this change was that returning NULL + recycling
>> rules means it's very easy for errors to silently propagate. But I
>> think this approach might be somewhat too aggressive - I hadn't
>> considered that people use `is.null()` to check for missing columns.
>>
>> We'll try and get an update to tibble out soon after useR. Thoughts
>> on what we should do are greatly appreciated.
>>
>> Hadley
>>
>> --
>> http://hadley.nz

--
http://hadley.nz

______________________________________________
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel
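
The column-existence checks discussed in this thread can be compared side by side on a plain data.frame. This is only a minimal sketch: has_name() below is a hypothetical helper written for illustration, mirroring the name_missing() idea above, and is not the function that tibble or base R actually ships.

```r
# A base data.frame with a single column named "xyz".
df <- data.frame(xyz = 1)

# `$` partially matches names, so "x" resolves to "xyz" and the
# is.null() test wrongly reports that the column exists.
is.null(df$x)
#> [1] FALSE

# `[[` matches names exactly by default, so the same test works here.
is.null(df[["x"]])
#> [1] TRUE

# Hypothetical helper (an assumption for illustration): an exact test
# against names(x) that never touches `$` or `[[`.
has_name <- function(x, name) {
  name %in% names(x)
}

has_name(df, "xyz")
#> [1] TRUE
has_name(df, "x")
#> [1] FALSE
```

Because the helper only inspects names(x), it avoids partial matching and does not depend on whether `$` or `[[` return NULL or throw an error for a missing column.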