Hi Martin, John, Thanks for the responses! I've tidied up some of the notes from this mailing list thread and posted them on the bug tracker.
John, in this case, I think namespaces are relevant because for non-exported S4 classes, the class information is made available through the '.__C__<package>' symbol in the package's namespace, but not the package environment that gets attached to the search path. In this (rare, yet not impossible) sequence of events, it looks like R attempts to resolve the '.__C__<package>' symbol in the wrong environment, and so class information lookup fails, and we end up caching the wrong inheritance information. Thanks, Kevin On Sun, Jul 31, 2016 at 5:12 AM, John Chambers <j...@r-project.org> wrote: > (Just returning from the "wilds" of Canada, so not able to comment on the > specifics, but ...) > > There is a basic point about generic functions that may be related to the > "private" class question and my earlier remarks that Martin alluded to. > > R (and S4 before it) allows packages to define methods for a generic > function in another package. Say, for plot() in graphics. > > The current model is that the generic plot() remains one function, > specifically a generic associated with the graphics package. > > Package A might define a method corresponding to one or two classes defined > in that package. When A is loaded, those methods are added to the table for > plot() in the current session. > > Now suppose a user calls a function, foo(), in package B, and that foo() in > turn calls plot(). This is the same plot() function, and in particular will > include the methods supplied from package A. > > This is regardless of the two packages having any overt connection. Also, > the methods are accepted by the generic function regardless of whether the > class is explicitly exported or not. In this sense, classes cannot be > entirely private if methods are defined for a non-local function. Namespaces > are not directly relevant. > > Whether this can lead to strange behavior isn't clear, and if so, is it a > sign that something undesirable was done in the particular example? (In > Extending R, I suggested a "right to write methods" that would discourage a > package from having methods unless it owned the function or some of the > classes.) > > R could adopt a different model for generic functions, where a package that > defined a method for a non-exported class would create a "local" version of > the generic, but that would likely raise some other issues. > > But seems like a useful topic for discussion. > > John > > On Jul 30, 2016, at 11:07 AM, Martin Maechler <maech...@stat.math.ethz.ch> > wrote: > >>>>>>> Kevin Ushey <kevinus...@gmail.com> >>>>>>> on Fri, 29 Jul 2016 11:46:19 -0700 writes: >> >>> I should add one more item that may be related here -- >>> calling 'methods:::.requirePackage' returns a different >>> result based on whether the package namespace is already >>> loaded or not. >> >>> If the package namespace is not loaded, the package is >>> loaded and attached, and the package environment is >>> returned: >> >>>> methods:::.requirePackage("digest") >>> Loading required package: digest <environment: >>> package:digest> attr(,"name") [1] "package:digest" >>> attr(,"path") [1] >>> "/Users/kevin/Library/R/3.3/library/digest" >>>> "digest" %in% loadedNamespaces() >>> [1] TRUE >>>> "package:digest" %in% search() >>> [1] TRUE >> >>> On the other hand, if the package namespace has already >>> been loaded, the package namespace is returned without >>> attaching the package: >> >>>> requireNamespace("digest") >>> Loading required namespace: digest >>>> methods:::.requirePackage("digest") >>> <environment: namespace:digest> >>>> "digest" %in% loadedNamespaces() >>> [1] TRUE >>>> "package:digest" %in% search() >>> [1] FALSE >> >>> This may be intentional, but the behavior seems surprising >>> and could be responsible for the behavior outlined >>> earlier. >> >> Yes, the behavior you outlined earlier is buggy, and I also have >> seen similar bugous behavior for the case of non-exported >> classes. >> >> Part of it is historical: The S4 code was mostly written before >> namespaces were introduced into R; I vaguely remember John >> Chambers (the principal creator of S4) saying that he did not >> intend the formal classes to be not visible... which in some >> sense only contains the fact that he (or anybody) would not >> think much about hidden objects before they were introduced. >> >> Still, in the mean time, most of us have seen many cases where >> we wanted to have "private" classes, and many packages do have >> them, too.... and they "mostly work" ;-) >> >> In other words, I agree that it would be very desirable to get >> to the bottom of this and fix such problems. >> >> .requirePackage() is among the parts of the methods package code >> which are quite delicate (and not much documented AFAIK, the hidden >> .requirePackage() function is a good example!). >> >> Delicate for at least two reasons: >> >> 1) They are not only used in crucial steps when "bootstrapping" >> the methods package ('methods' has to define its own S4 >> generics, methods, and classes before the package "exists"), >> >> 1b) they are also used both when building and installing another >> 'methods'-dependent package. This could be called >> "bootstrapping another package". >> >> 2) they are also much used whenever S4 code (aka >> 'methods'-dependent) packages are loaded and attached, >> so should be as fast as possible. >> I think you know that in recent years there have been >> considerable (and successful!) efforts to speedup package >> loading time, e.g. speeding up 'library(Matrix)' by about >> 2 factors of 2, and changing such functions should not >> deteriorate package-load speed noticably. >> >> Still, it is very desirable to improve / fix the issue: >> After all this musings, I'd currently guess that it would be a >> good idea if .requirePackage() would always return the >> *namespace* of the corresponding package unless that does not >> yet exist, as in the the 'bootstrap' situations above. >> ... or we'd add a new argument to .requirePackage() dealing with >> that, or we use two functions: .requirePackage() needed for >> boostrapping, returning a package envionment, and >> .requireNamespace() used to access class (and generic function!) >> environments. >> >> Well tested patches are very welcome; as is filing a formal >> bug report with bugzilla (https://bugs.r-project.org/ ; >> (if you have no "account" there, we will have to >> manually create one, for now, see the 'Note' on >> https://www.r-project.org/bugs.html because of the spammer >> attacks earlier this month ... but I see you, Kevin are >> registered there), >> or >> just continuing your findings here. >> >> Martin >> >> >>> Best, Kevin >> >>> On Fri, Jul 29, 2016 at 11:37 AM, Kevin Ushey >>> <kevinus...@gmail.com> wrote: >>>> I have a small idea as to what's going on now; at least, >>>> why exporting the class resolves this particular issue. >>>> >>>> Firstly, when an S4 class is not exported, the associated >>>> '.__C__<class>' object is not made part of the package >>>> environment. For example, I see: >>>> >>>>> getAnywhere(".__C__SubMatrix") A single object matching >>>> '.__C__SubMatrix' was found It was found in the following >>>> places namespace:s4inherits with value < ... > >>>> >>>> Note that the symbol is only discovered in the package >>>> namespace. When the class is exported (e.g. with >>>> 'exportClasses(SubMatrix)' in the NAMESPACE file), it's >>>> found both in 'package:s4inherits' and >>>> 'namespace:s4inherits'. >>>> >>>> Secondly, when R attempts to resolve the superclasses for >>>> an S3 class, the function 'methods:::.extendsForS3' is >>>> called. Tracing that code eventually gets us here: >>>> >>>> https://github.com/wch/r-source/blob/trunk/src/library/methods/R/SClasses.R#L255 >>>> >>>> Note that we reach this code path as the S3 class cache >>>> has not been populated yet; ie, this code returns NULL: >>>> >>>> https://github.com/wch/r-source/blob/trunk/src/library/methods/R/SClasses.R#L238-L240 >>>> >>>> So, the class hierarchy is looked up using this code: >>>> >>>> if(isTRUE(nzchar(package))) { whereP <- >>>> .requirePackage(package) value <- get0(cname, whereP, >>>> inherits = inherits) } >>>> >>>> However, because the '.__C__SubMatrix' object is only >>>> made available in the package's namespace, not within the >>>> package environment, it is not resolved, and so lookup >>>> fails. (Presumedly, that lookup is done when initially >>>> building a cache for S3 dispatch?) So, I wonder if that >>>> class lookup should occur within the package's namespace >>>> instead? >>>> >>>> Thanks for your time, Kevin >>>> >>>> On Sat, Jun 25, 2016 at 12:46 PM, Kevin Ushey >>>> <kevinus...@gmail.com> wrote: >>>>> Hi, >>>>> >>>>> (sorry for the wall of text; the issue here appears to >>>>> be rather complicated) >>>>> >>>>> I'm seeing a somewhat strange case where checking >>>>> whether an S4 object inherits from a parent class >>>>> defined from another package with 'inherits' fails if >>>>> that object is materialized through a call to >>>>> 'load'. That's a mouthful, so I've put together a >>>>> relatively small reproducible example online here: >>>>> >>>>> https://github.com/kevinushey/s4inherits >>>>> >>>>> This package, 's4inherits', defines an S4 class, >>>>> 'SubMatrix', that inherits from the 'dsyMatrix' class >>>>> defined in the Matrix package. After installing the >>>>> package, I run some simple tests: >>>>> >>>>> $ R -f test-save.R >>>>> >>>>>> library(s4inherits) data <- SubMatrix(1) >>>>>> >>>>>> is(data, "SubMatrix") >>>>> [1] TRUE >>>>>> inherits(data, "SubMatrix") >>>>> [1] TRUE >>>>>> >>>>>> is(data, "dsyMatrix") >>>>> [1] TRUE >>>>>> inherits(data, "dsyMatrix") >>>>> [1] TRUE >>>>>> >>>>>> save(data, file = "test.RData") >>>>>> >>>>> >>>>> All the inheritance checks report as we would expect. I >>>>> check that the inheritance reports are as expected, then >>>>> save that object to 'test.RData'. I then load that data >>>>> file in a new R session and run the same checks: >>>>> >>>>> $ R -f test-load.R >>>>> >>>>>> library(methods) load("test.RData") >>>>>> >>>>>> inherits(data, "SubMatrix") >>>>> Loading required package: s4inherits [1] TRUE >>>>>> is(data, "SubMatrix") >>>>> [1] TRUE >>>>>> >>>>>> inherits(data, "dsyMatrix") >>>>> [1] FALSE # (??) >>>>>> is(data, "dsyMatrix") >>>>> [1] TRUE >>>>>> >>>>> >>>>> Note that R now reports that my loaded object does _not_ >>>>> inherit from "dsyMatrix", yet this occurs only when >>>>> checked with 'inherits()' -- 'is' produces the expected >>>>> result. >>>>> >>>>> I do not see the behavior if I explicitly load / attach >>>>> the 's4inherits' package before loading the associated >>>>> data file; it only occurs if the package namespace is >>>>> loaded in response to loading the data object hosting a >>>>> 'SubMatrix' object. >>>>> >>>>> More precisely, the package namespace is loaded when the >>>>> promise hosting the data object is evaluated; that >>>>> promise being generated by 'load', if I understand >>>>> correctly. Somehow, evaluation of that promise within >>>>> the call to 'inherits' in this very special case causes >>>>> the unexpected behavior -- ie, if the first thing that >>>>> you do with the loaded object is check its class with >>>>> 'inherits', then you get this unexpected result. >>>>> >>>>> Even more, this behavior seems to go away if the >>>>> 's4inherits' package explicitly exports the class -- >>>>> however, you could imagine this class normally being >>>>> internal to the package, and so it may be undesirable to >>>>> export it. >>>>> >>>>> I checked a bit into the C code, and IIUC the check here >>>>> looks up the class hierarchy in the R_S4_extends_table >>>>> object defined in 'attrib.c', so it seems like that >>>>> mapping is potentially not getting constructed with the >>>>> full hierarchy (although hopefully someone with more >>>>> knowledge of the S4 internals + interaction with S3 can >>>>> elaborate). >>>>> >>>>> (FWIW, this was distilled from a case where S3 dispatch >>>>> on a similar loaded S4 object failed, due to failure to >>>>> resolve the class hierarchy for the S3 dispatch >>>>> context.) >>>>> >>>>> Thanks, Kevin >>>>> >>>>> --- >>>>> >>>>> $ R --slave -e "utils::sessionInfo()" R Under >>>>> development (unstable) (2016-06-13 r70769) Platform: >>>>> x86_64-apple-darwin15.5.0 (64-bit) Running under: OS X >>>>> 10.11.5 (El Capitan) >> >>> ______________________________________________ >>> R-devel@r-project.org mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-devel >> >> ______________________________________________ >> R-devel@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-devel > > ______________________________________________ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel