Returning to this topic: It's good to hear some of the rationale behind the current state of affairs. That said, the set-up we have now is quite difficult to work with; as mentioned before, I've had to hack around it like:
# Example from "BiocSingular", https://github.com/LTLA/BiocSingular .safe_colSums <- function(x) { if (is(x, "Matrix")) { Matrix::colSums(x) } else { colSums(x) } } ... which is ugly, and even worse, still incorrect, e.g., for non- Matrix classes that have methods for the implicit colSums generic. This situation is not sustainable for further package development. Is there a path forward that is palatable to everyone? Or perhaps these conversations are already happening on R-devel? -A On Tue, 2019-01-29 at 18:46 +0000, Pages, Herve wrote: > Yes the help system could enforce the full signature for the aliases > but > that means the end user then will have to always do > ?`colSums,SomeClass,ANY,ANY-method`, which feels unnecessary > complicated > and confusing in the case of a generic where dispatching on the 2nd > and > 3rd arguments hardly makes sense. > > Or are you saying that the help system should enforce an alias that > strictly matches the signature explicitly used in the setMethod > statement? Problem with this is that then there is no easy way for > the > end user to know a priori which form to use to access the man page. > Is > it ?`colSums,dgCMatrix,ANY,ANY-method` or is it > ?`colSums,dgCMatrix-method`. Right now when you type colSums<ENTERN> > (after loading the Matrix package), you get this: > > > library(Matrix) > > colSums > standardGeneric for "colSums" defined from package "base" > > function (x, na.rm = FALSE, dims = 1, ...) > standardGeneric("colSums") > <bytecode: 0x591c7d0> > <environment: 0x591a408> > Methods may be defined for arguments: x, na.rm, dims > Use showMethods("colSums") for currently available ones. > > This suggests that the correct form is ?`colSums,dgCMatrix,ANY,ANY- > method`. > > All this confusion can be avoided by specifying signature="x" in the > definition of the implicit generic. It formalizes where dispatch > really > happens and sets expectations upfront. No loose ends. > > Hope this makes sense, > > H. > > On 1/29/19 09:43, Martin Maechler wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Pages, Herve > > > > > > > on Tue, 29 Jan 2019 16:44:47 +0000 writes: > > > Hi Martin. Speed is not the concern: I just did some > > > quick benchmarking and didn't observe any significant > > > difference in method dispatch performance after doing > > > setGeneric("toto", function(x, a=0, b=0, c=0) > > > standardGeneric("toto")) vs doing setGeneric("toto", > > > signature="x", function(x, a=0, b=0, c=0) > > > standardGeneric("toto")). > > > > > Here is the real concern to me: > > > > > Aliases of the form > > > \alias{colSums,dgCMatrix,ANY,ANY-method} are a real pain > > > to maintain. It's also a pain for the end user to have to > > > do ?`colSums,dgCMatrix,ANY,ANY-method` to access the man > > > page for a particular method. I know Matrix uses "short" > > > aliases i.e. aliases of the form > > > \alias{colSums,dgCMatrix-method} so the user only needs to > > > do ?`colSums,dgCMatrix-method` but there is a lot of > > > fragility to the situation. > > > > > Here is why: The exact form that needs to be used for > > > these aliases can change anytime depending on what happens > > > in one of the upstream packages (not a concern for the > > > Matrix package because no upstream package defines colSums > > > methods). More precisely: If all the colSums methods > > > defined in the upstream packages and in my own package are > > > defined with setMethod statements of the form: > > > > > setMethod("colSums", signature(x="SomeClass"), ...) > > > > > then the aliases in the man pages must be of the form > > > \alias{colSums,SomeClass-method} and the end user can just > > > do ?`colSums,SomeClass-method`, which is good. But if > > > **one** upstream package decides to use a setMethod > > > statement of the form: > > > > > setMethod("colSums", signature(x="SomeClass", > > > na.rm="ANY", dims="ANY"), ...) > > > > > then the aliases for **all** the colSums methods in > > > **all** the downstream packages now must be of the form > > > \alias{colSums,SomeOtherClass,ANY,ANY-method}, even if the > > > method for SomeOtherClass is defined with > > > signature(x="SomeOtherClass"). > > > > Hmm... but to me, the behavior you describe in the above paragraph > > seems rather an implementation "infelicity" in R's help / > > documentation system, > > than an intrinsic necessity. Or have you thought more about > > this and discussed it with other S4 experts (John Chambers, > > Michael L., Martin Morgan, ...) and came to a different > > conclusion? > > > > Very generally: > > > > Just because the documentation (help system) > > rules are implemented as they are should *NOT* influence "the > > best way" to program things in R. > > > > and particularly for something such as S4 which has been adapted > > and tuned for a long time ... > > > > So, rather the documentation "setup" should adapt to what seems > > best from an R coding point of view. > > > > More specifically, if we are allowed to use short signatures in R > > code, i.e., > > signature(x=<someClass>) > > short for > > signature(x=<someClass>, na.m="ANY", dims="ANY") > > > > then the documentation \alias{} should allow to use the same > > principle, as the documentation / help "keys" which \alias{.} > > constructs > > will be similarly uniquely determined > > (at least as long as other packages do not describe methods for > > "my" <someClass>) > > > > So, the help / documentation (and "R CMD check" checks) should > > have been changed long ago, if you had sent patches to do so, > > n'est-ce pas? :-) ;-) [[yes, half jokingly]]. > > > > Martin > > > > > Also, as a consequence, now > > > the end user has to use the long syntax to access the man > > > pages for these methods. And if later the author of the > > > upstream package decides to switch back to the > > > setMethod("colSums", signature(x="SomeClass"), ...) form, > > > then I have to switch back all the aliases in all my > > > downstream packages to the short form again! > > > > > This fragility of the alias syntax was one of the > > > motivations for me to put many setGeneric statements of > > > the form setGeneric("someGeneric", signature="x") in > > > BiocGenerics over the years. So I don't have many dozens > > > of aliases that suddenly break for mysterious reasons ('R > > > CMD check' would suddenly starts reporting warnings for > > > these aliases despite no changes in my package or in R). > > > > > Best, > > > > > H. > > > > > On 1/29/19 03:16, Martin Maechler wrote: > > >>>>>>> Michael Lawrence on Mon, 28 Jan 2019 20:47:58 -0800 > > >>>>>>> writes: > > >> > That will have some consequences; for example, > > > >> documentation aliases will need to change. Not sure how > > > >> many packages will need to be fixed outside of Matrix, > > >> but > it's not an isolated change. Martin might comment > > >> on the > rationale for the full signature, since he > > >> defined those > generics. > > >> > > >> > On Mon, Jan 28, 2019 at 7:21 PM Pages, Herve > > > >> <hpa...@fredhutch.org> wrote: > > >> >> > > >> >> Actually there is a 4th solution which is to modify > > >> the >> definition of the implicit generics in the methods > > >> >> package (file makeBasicFunsList.R) to make them > > >> dispatch >> on their 1st arg only. Should be easy. Then > > >> no package >> will need to use a setGeneric statement >> > > >> anymore. Everybody will automatically get a clean >> > > >> implicit generic. Including the Matrix package which >> > > >> shouldn't need to be touched (except maybe for some >> > > >> aliases in its man page that might need to be changed >> > > >> from \alias{colSums,dgCMatrix,ANY,ANY-method} to >> > > >> \alias{colSums,dgCMatrix-method}). > > >> >> > > >> >> Anybody wants to try to make a patch for this? > > >> > > >> >> H. > > >> > > >> I've already replied without having read the above two > > >> messages. In my reply I had indeed more or less argued > > >> as Hervé does above. > > >> > > >> Michael, Hervé, .. : Why is it really so much better to > > >> disallow dispatch for the other compulsory arguments? > > >> Dispatch there allows to use methods for class "missing" > > >> which is nicer in my eyes than the traditional default > > >> argument + missing() "tricks". > > >> > > >> Is it mainly speed you are concerned about. If yes, do > > >> we have data (and data analysis) about performance here? > > >> > > >> Martin > > >> > > >> >> > > >> >> On 1/28/19 19:00, Michael Lawrence wrote: > I agree > > >> (2) >> is a good compromise. CC'ing Martin for his > > >> perspective. > > >> >> > > > >> >> > Michael > > >> >> > > > >> >> > On Mon, Jan 28, 2019 at 6:58 PM Pages, Herve >> > > >> <hpa...@fredhutch.org> wrote: >> Hi Aaron, > > >> >> >> > > >> >> >> The 4 matrix summarization generics currently > > >> defined >> in BiocGenerics >> are defined as followed: > > >> >> >> > > >> >> >> setGeneric("rowSums", signature="x") >> >> > > >> setGeneric("colSums", signature="x") >> >> > > >> setGeneric("rowMeans", signature="x") >> >> > > >> setGeneric("colMeans", signature="x") > > >> >> >> > > >> >> >> The only reason for having these definitions in >> > > >> BiocGenerics is to >> restrict dispatch the first >> > > >> argument. This is cleaner than what we would >> get with > > >> >> the implicit generics where dispatch is on all > > >> arguments >> (it >> doesn't really make sense to dispatch > > >> on toggles >> like 'na.rm' or >> 'dims'). Sticking to > > >> simple dispatch >> when possible makes life easier for >> > > >> the developer >> (especially in times of troubleshooting) > > >> and for the user >> >> (methods are easier to discover > > >> and their man pages >> easier to access). > > >> >> >> > > >> >> >> However, the 4 statements above create new generics > > >> >> that mask the >> implicit generics defined in the > > >> Matrix >> package (Matrix doesn't contain >> any > > >> setGeneric >> statements for these generics, only > > >> setMethod >> >> statements). This is a very unsatisfying > > >> situation and it >> has hit me >> repeatedly over the > > >> last couple of years. > > >> >> >> > > >> >> >> We have basically 3 ways to go. From simpler to > > >> more >> complicated: > > >> >> >> > > >> >> >> 1) Give up on single dispatch for these > > >> generics. That >> is, we remove the >> 4 statements above > > >> from >> BiocGenerics. Then we use setMethod() in package > > >> >> code >> like Matrix does. > > >> >> >> > > >> >> >> 2) Convince the Matrix folks to put the 4 > > >> statements >> above in Matrix. >> Then any BioC package > > >> that needs to >> define methods for these generics >> > > >> would just need to >> import them from the Matrix > > >> package. Maybe we could >> >> even push this one step > > >> further by having BiocGenerics >> import and >> re-export > > >> these generics. This would make >> them "available" in > > >> BioC as >> soon as the BiocGenerics >> is loaded (and any > > >> package that needs to define >> >> methods on them would > > >> just need to import them from >> BiocGenerics). > > >> >> >> > > >> >> >> 3) Put the 4 statements above in a MatrixGenerics > > >> >> package. Then convince >> the Matrix folks to define > > >> >> methods on the generics defined in >> >> > > >> MatrixGenerics. Very unlikely to happen! > > >> >> >> > > >> >> >> IMO 2) is the best compromise. Want to give it a > > >> shot? > > >> >> >> > > >> >> >> H. > > >> >> >> > > >> >> >> > > >> >> >> On 1/27/19 13:45, Aaron Lun wrote: >>> This is a >> > > >> resurrection of some old threads: > > >> >> >>> > > >> >> >>> > > >> >> > > >> https://urldefense.proofpoint.com/v2/url?u=https-3A__stat.e > > thz.ch_pipermail_bioc-2Ddevel_2017- > > 2DNovember_012273.html&d=DwIDaQ&c=eRAMFD45gAfqt84VtBcfhQ&r=BK7q3XeA > > vimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA&m=O21AQgvbUp3XRwM4jf0WeZA2ePj9y > > T3fc2X5hOsKNJk&s=pcpUyjpkQe6U79lZ_n2SANNp6Zj_s6i1Sq2yZx2NSjw&e= > > >> >> >>> > > >> >> >>> > > >> >> > > >> https://urldefense.proofpoint.com/v2/url?u=https-3A__github > > .com_Bioconductor_MatrixGenerics_issues&d=DwIDaQ&c=eRAMFD45gAfqt84V > > tBcfhQ&r=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA&m=O21AQgvbUp3X > > RwM4jf0WeZA2ePj9yT3fc2X5hOsKNJk&s=NrmcVnmvgkDp3p64J- > > izZU9VD5nFsFCWOTI-TsnzCpY&e= > > >> >> >>> > > >> >> >>> For those who are unfamiliar with this, the basic > > >> >> issue is that various >>> Matrix and BiocGenerics >> > > >> functions mask each other. This is mildly >>> frustrating > > >> >> in interactive sessions: > > >> >> >>> > > >> >> >>>> library(Matrix) >>>> library(DelayedArray) >>>> x > > >> <- >> rsparsematrix(10, 10, 0.1) >>>> colSums(x) # fails > > >> >>>> >> Matrix::colSums(x) # okay >>> ... but quite > > >> annoying >> during package development, requiring code > > >> like >>> this: > > >> >> >>> > > >> >> >>> if (is(x, "Matrix")) { >>> z <- Matrix::colSums(x) > > >> >> >>> } else { >>> z <- colSums(x) # assuming > > >> DelayedArray >> does the masking. >>> } > > >> >> >>> > > >> >> >>> ... which defeats the purpose of using S4 dispatch > > >> in >> the first place. > > >> >> >>> > > >> >> >>> I have been encountering this issue with > > >> increasing >> frequency in my >>> packages, as a lot of > > >> my code base >> needs to be able to interface with >>> > > >> both Matrix and >> Bioconductor objects (e.g., > > >> DelayedMatrices) at the >>> >> same time. What needs to > > >> happen so that I can just write: > > >> >> >>> > > >> >> >>> z <- colSums(x) > > >> >> >>> > > >> >> >>> ... and everything will work for both Matrix and > > >> >> Bioconductor classes? >>> It seems that many of these > > >> >> function names are implicit generics >>> anyway, can > > >> >> BiocGenerics take advantage of that for the time > > >> being? > > >> >> >>> > > >> >> >>> Best, > > >> >> >>> > > >> >> >>> Aaron > > >> >> >>> > > >> >> >>> _______________________________________________ > > >> >>> >> Bioc-devel@r-project.org mailing list >>> >> > > >> https://urldefense.proofpoint.com/v2/url?u=https-3A__stat.e > > thz.ch_mailman_listinfo_bioc- > > 2Ddevel&d=DwIDaQ&c=eRAMFD45gAfqt84VtBcfhQ&r=BK7q3XeAvimeWdGbWY_wJYb > > W0WYiZvSXAJJKaaPhzWA&m=O21AQgvbUp3XRwM4jf0WeZA2ePj9yT3fc2X5hOsKNJk& > > s=JtgGBnaZJH44fV8OUp-SwnHxhD_i_mdVkqoMfUoA5tM&e= > > >> >> >> -- > > >> >> >> Hervé Pagès > > >> >> >> > > >> >> >> Program in Computational Biology >> Division of > > >> Public >> Health Sciences >> Fred Hutchinson Cancer > > >> Research Center >> >> 1100 Fairview Ave. N, M1-B514 >> > > >> P.O. Box 19024 >> >> Seattle, WA 98109-1024 > > >> >> >> > > >> >> >> E-mail: hpa...@fredhutch.org >> Phone: (206) > > >> 667-5791 >> >> Fax: (206) 667-1319 > > >> >> >> > > >> >> >> _______________________________________________ >> > > >> >> Bioc-devel@r-project.org mailing list >> >> > > >> https://urldefense.proofpoint.com/v2/url?u=https-3A__stat.e > > thz.ch_mailman_listinfo_bioc- > > 2Ddevel&d=DwIFaQ&c=eRAMFD45gAfqt84VtBcfhQ&r=BK7q3XeAvimeWdGbWY_wJYb > > W0WYiZvSXAJJKaaPhzWA&m=c- > > Mmi30ouubEEHC5W9_X6DIwxblt1nQlIfgCaK8uCJU&s=U8Hu1kzglD_RP7t_eR5w_nY > > AIaupBgrEKx11geSZwVg&e= > > >> >> > > >> >> -- > > >> >> Hervé Pagès > > >> >> > > >> >> Program in Computational Biology Division of Public >> > > >> Health Sciences Fred Hutchinson Cancer Research Center >> > > >> 1100 Fairview Ave. N, M1-B514 P.O. Box 19024 Seattle, WA > > >> >> 98109-1024 > > >> >> > > >> >> E-mail: hpa...@fredhutch.org Phone: (206) 667-5791 > > >> Fax: >> (206) 667-1319 > > >> >> > > > > > -- > > > Hervé Pagès > > > > > Program in Computational Biology Division of Public Health > > > Sciences Fred Hutchinson Cancer Research Center 1100 > > > Fairview Ave. N, M1-B514 P.O. Box 19024 Seattle, WA > > > 98109-1024 > > > > > E-mail: hpa...@fredhutch.org Phone: (206) 667-5791 Fax: > > > (206) 667-1319 > > > -- > Hervé Pagès > > Program in Computational Biology > Division of Public Health Sciences > Fred Hutchinson Cancer Research Center > 1100 Fairview Ave. N, M1-B514 > P.O. Box 19024 > Seattle, WA 98109-1024 > > E-mail: hpa...@fredhutch.org > Phone: (206) 667-5791 > Fax: (206) 667-1319 > _______________________________________________ Bioc-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/bioc-devel