Hi Martin, John,

Thanks for the responses! I've tidied up some of the notes from this
mailing list thread and posted them on the bug tracker.

John, in this case, I think namespaces are relevant because for
non-exported S4 classes, the class information is made available
through the '.__C__<package>' symbol in the package's namespace, but
not the package environment that gets attached to the search path. In
this (rare, yet not impossible) sequence of events, it looks like R
attempts to resolve the '.__C__<package>' symbol in the wrong
environment, and so class information lookup fails, and we end up
caching the wrong inheritance information.

Thanks,
Kevin

On Sun, Jul 31, 2016 at 5:12 AM, John Chambers <j...@r-project.org> wrote:
> (Just returning from the "wilds" of Canada, so not able to comment on the 
> specifics, but ...)
>
> There is a basic point about generic functions that may be related to the 
> "private" class question and my earlier remarks that Martin alluded to.
>
> R (and S4 before it)  allows packages to define methods for a generic 
> function in another package.  Say, for plot() in graphics.
>
> The current model is that the generic plot() remains one function, 
> specifically a generic associated with the graphics package.
>
> Package A might define a method corresponding to one or two classes defined 
> in that package.  When A is loaded, those methods are added to the table for 
> plot() in the current session.
>
> Now suppose a user calls a function, foo(), in package B, and that foo() in 
> turn calls plot().  This is the same plot() function, and in particular will 
> include the methods supplied from package A.
>
> This is regardless of the two packages having any overt connection.  Also, 
> the methods are accepted by the generic function regardless of whether the 
> class is explicitly exported or not.  In this sense, classes cannot be 
> entirely private if methods are defined for a non-local function.  Namespaces 
> are not directly relevant.
>
> Whether this can lead to strange behavior isn't clear, and if so, is it a 
> sign that something undesirable was done in the particular example?  (In 
> Extending R, I suggested a "right to write methods" that  would discourage a 
> package from having methods unless it owned the function or some of the 
> classes.)
>
> R could adopt a different model for generic functions, where a package that 
> defined a method for a non-exported class would create a "local" version of 
> the generic, but that would likely raise some other issues.
>
> But seems like a useful topic for discussion.
>
> John
>
> On Jul 30, 2016, at 11:07 AM, Martin Maechler <maech...@stat.math.ethz.ch> 
> wrote:
>
>>>>>>> Kevin Ushey <kevinus...@gmail.com>
>>>>>>>    on Fri, 29 Jul 2016 11:46:19 -0700 writes:
>>
>>> I should add one more item that may be related here --
>>> calling 'methods:::.requirePackage' returns a different
>>> result based on whether the package namespace is already
>>> loaded or not.
>>
>>> If the package namespace is not loaded, the package is
>>> loaded and attached, and the package environment is
>>> returned:
>>
>>>> methods:::.requirePackage("digest")
>>>    Loading required package: digest <environment:
>>> package:digest> attr(,"name") [1] "package:digest"
>>> attr(,"path") [1]
>>> "/Users/kevin/Library/R/3.3/library/digest"
>>>> "digest" %in% loadedNamespaces()
>>>    [1] TRUE
>>>> "package:digest" %in% search()
>>>    [1] TRUE
>>
>>> On the other hand, if the package namespace has already
>>> been loaded, the package namespace is returned without
>>> attaching the package:
>>
>>>> requireNamespace("digest")
>>>    Loading required namespace: digest
>>>> methods:::.requirePackage("digest")
>>>    <environment: namespace:digest>
>>>> "digest" %in% loadedNamespaces()
>>>    [1] TRUE
>>>> "package:digest" %in% search()
>>>    [1] FALSE
>>
>>> This may be intentional, but the behavior seems surprising
>>> and could be responsible for the behavior outlined
>>> earlier.
>>
>> Yes, the behavior you outlined earlier is buggy, and I also have
>> seen similar bugous behavior for the case of non-exported
>> classes.
>>
>> Part of it is historical:  The S4 code was mostly written before
>> namespaces were introduced into R;   I vaguely remember John
>> Chambers (the principal creator of S4) saying that he did not
>> intend the formal classes to be not visible... which in some
>> sense only contains the fact that he (or anybody) would not
>> think much about hidden objects before they were introduced.
>>
>> Still, in the mean time, most of us have seen many cases where
>> we wanted to have "private" classes,  and many packages do have
>> them, too.... and they "mostly work" ;-)
>>
>> In other words, I agree that it would be very desirable to get
>> to the bottom of this and fix such problems.
>>
>> .requirePackage() is among the parts of the methods package code
>> which are quite delicate (and not much documented AFAIK, the hidden
>> .requirePackage() function is a good example!).
>>
>> Delicate for at least two reasons:
>>
>> 1) They are not only used in crucial steps when "bootstrapping"
>>   the methods package ('methods' has to define its own S4
>>   generics, methods, and classes before the package "exists"),
>>
>> 1b) they are also used both when building and installing another
>>    'methods'-dependent package.  This could be called
>>    "bootstrapping another package".
>>
>> 2) they are also much used whenever S4 code (aka
>>   'methods'-dependent) packages are loaded and attached,
>>   so should be as fast as possible.
>>   I think you know that in recent years there have been
>>   considerable (and successful!) efforts to speedup package
>>   loading time, e.g. speeding up 'library(Matrix)' by about
>>   2 factors of 2, and changing such functions should not
>>   deteriorate package-load speed noticably.
>>
>> Still, it is very desirable to improve / fix the issue:
>> After all this musings, I'd currently guess that it would be a
>> good idea if  .requirePackage()  would always return the
>> *namespace* of the corresponding package unless that does not
>> yet exist, as in the the 'bootstrap' situations above.
>> ... or we'd add a new argument to .requirePackage() dealing with
>> that, or we use two functions:  .requirePackage() needed for
>> boostrapping, returning a package envionment, and
>> .requireNamespace() used to access class (and generic function!)
>> environments.
>>
>> Well tested patches are very welcome; as is filing a formal
>> bug report with bugzilla (https://bugs.r-project.org/ ;
>> (if you have no "account" there, we will have to
>> manually create one, for now, see the 'Note' on
>> https://www.r-project.org/bugs.html because of the spammer
>> attacks earlier this month ... but I see you, Kevin are
>> registered there),
>> or
>> just continuing your findings here.
>>
>> Martin
>>
>>
>>> Best, Kevin
>>
>>> On Fri, Jul 29, 2016 at 11:37 AM, Kevin Ushey
>>> <kevinus...@gmail.com> wrote:
>>>> I have a small idea as to what's going on now; at least,
>>>> why exporting the class resolves this particular issue.
>>>>
>>>> Firstly, when an S4 class is not exported, the associated
>>>> '.__C__<class>' object is not made part of the package
>>>> environment.  For example, I see:
>>>>
>>>>> getAnywhere(".__C__SubMatrix") A single object matching
>>>> '.__C__SubMatrix' was found It was found in the following
>>>> places namespace:s4inherits with value < ... >
>>>>
>>>> Note that the symbol is only discovered in the package
>>>> namespace. When the class is exported (e.g. with
>>>> 'exportClasses(SubMatrix)' in the NAMESPACE file), it's
>>>> found both in 'package:s4inherits' and
>>>> 'namespace:s4inherits'.
>>>>
>>>> Secondly, when R attempts to resolve the superclasses for
>>>> an S3 class, the function 'methods:::.extendsForS3' is
>>>> called. Tracing that code eventually gets us here:
>>>>
>>>> https://github.com/wch/r-source/blob/trunk/src/library/methods/R/SClasses.R#L255
>>>>
>>>> Note that we reach this code path as the S3 class cache
>>>> has not been populated yet; ie, this code returns NULL:
>>>>
>>>> https://github.com/wch/r-source/blob/trunk/src/library/methods/R/SClasses.R#L238-L240
>>>>
>>>> So, the class hierarchy is looked up using this code:
>>>>
>>>> if(isTRUE(nzchar(package))) { whereP <-
>>>> .requirePackage(package) value <- get0(cname, whereP,
>>>> inherits = inherits) }
>>>>
>>>> However, because the '.__C__SubMatrix' object is only
>>>> made available in the package's namespace, not within the
>>>> package environment, it is not resolved, and so lookup
>>>> fails. (Presumedly, that lookup is done when initially
>>>> building a cache for S3 dispatch?) So, I wonder if that
>>>> class lookup should occur within the package's namespace
>>>> instead?
>>>>
>>>> Thanks for your time, Kevin
>>>>
>>>> On Sat, Jun 25, 2016 at 12:46 PM, Kevin Ushey
>>>> <kevinus...@gmail.com> wrote:
>>>>> Hi,
>>>>>
>>>>> (sorry for the wall of text; the issue here appears to
>>>>> be rather complicated)
>>>>>
>>>>> I'm seeing a somewhat strange case where checking
>>>>> whether an S4 object inherits from a parent class
>>>>> defined from another package with 'inherits' fails if
>>>>> that object is materialized through a call to
>>>>> 'load'. That's a mouthful, so I've put together a
>>>>> relatively small reproducible example online here:
>>>>>
>>>>> https://github.com/kevinushey/s4inherits
>>>>>
>>>>> This package, 's4inherits', defines an S4 class,
>>>>> 'SubMatrix', that inherits from the 'dsyMatrix' class
>>>>> defined in the Matrix package.  After installing the
>>>>> package, I run some simple tests:
>>>>>
>>>>> $ R -f test-save.R
>>>>>
>>>>>> library(s4inherits) data <- SubMatrix(1)
>>>>>>
>>>>>> is(data, "SubMatrix")
>>>>> [1] TRUE
>>>>>> inherits(data, "SubMatrix")
>>>>> [1] TRUE
>>>>>>
>>>>>> is(data, "dsyMatrix")
>>>>> [1] TRUE
>>>>>> inherits(data, "dsyMatrix")
>>>>> [1] TRUE
>>>>>>
>>>>>> save(data, file = "test.RData")
>>>>>>
>>>>>
>>>>> All the inheritance checks report as we would expect. I
>>>>> check that the inheritance reports are as expected, then
>>>>> save that object to 'test.RData'. I then load that data
>>>>> file in a new R session and run the same checks:
>>>>>
>>>>> $ R -f test-load.R
>>>>>
>>>>>> library(methods) load("test.RData")
>>>>>>
>>>>>> inherits(data, "SubMatrix")
>>>>> Loading required package: s4inherits [1] TRUE
>>>>>> is(data, "SubMatrix")
>>>>> [1] TRUE
>>>>>>
>>>>>> inherits(data, "dsyMatrix")
>>>>> [1] FALSE # (??)
>>>>>> is(data, "dsyMatrix")
>>>>> [1] TRUE
>>>>>>
>>>>>
>>>>> Note that R now reports that my loaded object does _not_
>>>>> inherit from "dsyMatrix", yet this occurs only when
>>>>> checked with 'inherits()' -- 'is' produces the expected
>>>>> result.
>>>>>
>>>>> I do not see the behavior if I explicitly load / attach
>>>>> the 's4inherits' package before loading the associated
>>>>> data file; it only occurs if the package namespace is
>>>>> loaded in response to loading the data object hosting a
>>>>> 'SubMatrix' object.
>>>>>
>>>>> More precisely, the package namespace is loaded when the
>>>>> promise hosting the data object is evaluated; that
>>>>> promise being generated by 'load', if I understand
>>>>> correctly. Somehow, evaluation of that promise within
>>>>> the call to 'inherits' in this very special case causes
>>>>> the unexpected behavior -- ie, if the first thing that
>>>>> you do with the loaded object is check its class with
>>>>> 'inherits', then you get this unexpected result.
>>>>>
>>>>> Even more, this behavior seems to go away if the
>>>>> 's4inherits' package explicitly exports the class --
>>>>> however, you could imagine this class normally being
>>>>> internal to the package, and so it may be undesirable to
>>>>> export it.
>>>>>
>>>>> I checked a bit into the C code, and IIUC the check here
>>>>> looks up the class hierarchy in the R_S4_extends_table
>>>>> object defined in 'attrib.c', so it seems like that
>>>>> mapping is potentially not getting constructed with the
>>>>> full hierarchy (although hopefully someone with more
>>>>> knowledge of the S4 internals + interaction with S3 can
>>>>> elaborate).
>>>>>
>>>>> (FWIW, this was distilled from a case where S3 dispatch
>>>>> on a similar loaded S4 object failed, due to failure to
>>>>> resolve the class hierarchy for the S3 dispatch
>>>>> context.)
>>>>>
>>>>> Thanks, Kevin
>>>>>
>>>>> ---
>>>>>
>>>>> $ R --slave -e "utils::sessionInfo()" R Under
>>>>> development (unstable) (2016-06-13 r70769) Platform:
>>>>> x86_64-apple-darwin15.5.0 (64-bit) Running under: OS X
>>>>> 10.11.5 (El Capitan)
>>
>>> ______________________________________________
>>> R-devel@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>> ______________________________________________
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>
>

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Reply via email to