Would this scenario satisfy "make the package _directly_ compatible with standard Bioconductor data structures"?

If the input is a SummarizedExperiment, return a SummarizedExperiment; if the input is a tbl_df or ttBulk, return a ttBulk (?)

Best wishes.

*Stefano*

Stefano Mangiola | Postdoctoral fellow
Papenfuss Laboratory
The Walter and Eliza Hall Institute of Medical Research
+61 (0)466452544

On Fri, 7 Feb 2020 at 16:15, Michael Lawrence <lawrence.mich...@gene.com> wrote:

> I would urge you to make the package _directly_ compatible with
> standard Bioconductor data structures; no explicit conversion. But you
> can create wrapper methods (even on an S3 generic) that perform the
> conversion automatically. You'll probably want two separate APIs
> though (in different styles); for one thing, automatic conversion is
> obviously not possible for return values.
>
> Michael
>
> On Thu, Feb 6, 2020 at 5:34 PM stefano <mangiolastef...@gmail.com> wrote:
> >
> > Thanks Michael,
> >
> > yes, in a sense, ttBulk and SummarizedExperiment can be considered as two
> > interfaces. Would it be fair enough to create a function that converts from
> > one to the other, although the default would be ttBulk?
> >
> > > I'm not sure the tidyverse is a great answer to the user interface,
> > > because it lacks domain semantics
> >
> > Would it be fair to say that the ttBulk class could be considered a tibble
> > with specific semantics? In the sense that it holds information about key
> > column names (.sample, .transcript, .abundance, .normalised_abundance, etc.),
> > and has a validator (that is triggered at every ttBulk function).
> >
> > I think at the moment, given (i) the S3 problem, and (ii) the lack of a formal
> > foundation for a SummarizedExperiment interface (which maybe would require S4
> > technology itself, where SummarizedExperiment could be a slot?), my package
> > would belong more on CRAN, until those two issues have been resolved.
> >
> > I imagine there are not many cases where a CRAN package migrated to
> > Bioconductor after complying with the ecosystem policies.
> >
> > Thanks a lot.
> >
> > Best wishes.
> >
> > Stefano
> >
> > On Fri, 7 Feb 2020 at 12:12, Michael Lawrence <lawrence.mich...@gene.com> wrote:
> >>
> >> There's a difference between implementing software, where one wants
> >> formal data structures, and providing a convenient user interface.
> >> Software needs to interface with other software, so a package could
> >> provide both types of interfaces, one based on rich (S4) data
> >> structures, another on simpler structures with an API more amenable to
> >> analysis. I'm not sure the tidyverse is a great answer to the user
> >> interface, because it lacks domain semantics. This is still an active
> >> area of research (see Stuart Lee's plyranges, for example). I hope you
> >> can find a reasonable compromise that enables you to integrate ttBulk
> >> into Bioconductor, so that it can take advantage of the synergies the
> >> ecosystem provides.
> >>
> >> PS: There is no simple fix for your example.
> >>
> >> Michael
> >>
> >> On Thu, Feb 6, 2020 at 4:12 PM stefano <mangiolastef...@gmail.com> wrote:
> >> >
> >> > Thanks a lot for your comments, Martin and Michael.
> >> >
> >> > Here I reply to Martin's comment. Michael, I will try to implement your
> >> > solution!
> >> >
> >> > I think a key point from
> >> > https://github.com/Bioconductor/Contributions/issues/1355#issuecomment-580977106
> >> > (that I was overlooking) is
> >> >
> >> > *>> "So to sum up: if you submit a package to Bioconductor, there is an
> >> > expectation that your package can work seamlessly with other Bioconductor
> >> > packages, and your implementation should support that. The safest and
> >> > easiest way to do that is to use Bioconductor data structures"*
> >> >
> >> > In this case my package would not be suited, as I do not use pre-existing
> >> > Bioconductor data structures; instead I see value in using a simple
> >> > tibble, for the reasons explained in part in the README
> >> > https://github.com/stemangiola/ttBulk (harvesting the power of tidyverse
> >> > and friends for bulk transcriptomic analyses).
> >> >
> >> > *>> "with the minimum standard of being able to accept such objects even if
> >> > you do not rely on them internally (though you should)"*
> >> >
> >> > With this I can comply, in the sense that I can build converters to and from
> >> > SummarizedExperiment (for example).
> >> >
> >> > *>> "If you don't want to do that, then that's a shame, but it would
> >> > suggest that Bioconductor would not be the right place to host this
> >> > package."*
> >> >
> >> > Well said.
> >> >
> >> > In summary, I do not rely on Bioconductor data structures, as I am proposing
> >> > another paradigm, but my back end is made largely of Bioconductor analysis
> >> > packages that I would like to interface with the tidyverse. So
> >> >
> >> > 1) Should I build converters to Bioconductor data structures, and force the
> >> > use of S3 objects (needed for the tidyverse to work), or
> >> > 2) Submit to CRAN?
> >> >
> >> > I don't have strong feelings for either, although I think Bioconductor would
> >> > be a good fit. Community, please give me your honest opinions; I will take
> >> > them seriously and proceed.
> >> >
> >> > Best wishes.
> >> >
> >> > *Stefano*
> >> >
> >> > On Fri, 7 Feb 2020 at 10:46, Martin Morgan <mtmorgan.b...@gmail.com> wrote:
> >> >
> >> > > The idea isn't to use S4 at any cost, but to 'play well' with the
> >> > > Bioconductor ecosystem, including writing robust and maintainable code.
> >> > >
> >> > > This comment
> >> > > https://github.com/Bioconductor/Contributions/issues/1355#issuecomment-580977106
> >> > > provides some motivation; there was also an interesting exchange on the
> >> > > Bioconductor community slack about this (join at
> >> > > https://bioc-community.herokuapp.com/; discussion starting with
> >> > > https://community-bioc.slack.com/archives/C35G93GJH/p1580144746014800).
> >> > > The plyranges package http://bioconductor.org/packages/plyranges and
> >> > > recently accepted fluentGenomics workflow
> >> > > https://github.com/Bioconductor/Contributions/issues/1350 provide
> >> > > illustrations.
> >> > >
> >> > > In your domain it's really surprising that your package does not use
> >> > > (Import or Depend on) the SummarizedExperiment or GenomicRanges packages.
> >> > > From a superficial look at your package, it seems like something like
> >> > > `reduce_dimensions()` could be defined to take & return a
> >> > > SummarizedExperiment and hence benefit from some of the points in the
> >> > > github issue comment mentioned above.
> >> > >
> >> > > Certainly there is a useful transition, both 'on the way in' to a
> >> > > SummarizedExperiment, and after leaving the more specialized bioinformatic
> >> > > computations to, e.g., display a pairs plot of the reduced dimensions,
> >> > > where one might re-shape the data to a tidy format and use 'plain old'
> >> > > tibbles; the fluentGenomics workflow might provide some guidance.
> >> > >
> >> > > At the end of the day it would not be surprising for Bioconductor packages
> >> > > to make use of tidy concepts and data structures, particularly in the
> >> > > vignette, and it would be a mistake for Bioconductor to exclude
> >> > > well-motivated 'tidy' representations.
> >> > >
> >> > > Martin Morgan
> >> > >
> >> > > On 2/6/20, 5:46 PM, "Bioc-devel on behalf of stefano" <
> >> > > bioc-devel-boun...@r-project.org on behalf of mangiolastef...@gmail.com>
> >> > > wrote:
> >> > >
> >> > >     Hello,
> >> > >
> >> > >     I have a package (ttBulk) under review. I have been told to replace
> >> > >     the S3 system with S4. My package is based on the class tbl_df and
> >> > >     must be fully compatible with tidyverse methods (inheritance). After
> >> > >     some tests and research I understood that the tidyverse ecosystem is
> >> > >     not compatible with S4 classes.
> >> > >
> >> > >     For example, several methods apparently do not handle S4 objects
> >> > >     based on the S3 class tbl_df
> >> > >
> >> > >     ```
> >> > >     library(tidyverse)
> >> > >
> >> > >     # Register the S3 class so that an S4 class can extend it
> >> > >     setOldClass("tbl_df")
> >> > >     setClass("test2", contains = "tbl_df")
> >> > >
> >> > >     my <- new("test2", tibble(a = 1))
> >> > >     my %>% mutate(b = 3)  # works
> >> > >     #>   a b
> >> > >     #> 1 1 3
> >> > >     ```
> >> > >
> >> > >     ```
> >> > >     my <- new("test2", tibble(a = rnorm(100), b = 1))
> >> > >     my %>% nest(data = -b)  # fails
> >> > >     #> Error: `x` must be a vector, not a `test2` object
> >> > >     #> Run `rlang::last_error()` to see where the error occurred.
> >> > >     ```
> >> > >
> >> > >     Could you please advise whether a tidyverse-based package can be
> >> > >     hosted on Bioconductor, and if S4 classes are really mandatory? I
> >> > >     need to understand if I am forced to submit to CRAN instead (although
> >> > >     Bioconductor would be a good fit, since I try to interface
> >> > >     transcriptional analysis tools with the tidy universe).
> >> > >
> >> > >     Thanks a lot.
> >> > >     Stefano
> >> > >
> >> > > _______________________________________________
> >> > > Bioc-devel@r-project.org mailing list
> >> > > https://stat.ethz.ch/mailman/listinfo/bioc-devel
> >>
> >> --
> >> Michael Lawrence
> >> Senior Scientist, Bioinformatics and Computational Biology
> >> Genentech, A Member of the Roche Group
> >> Office +1 (650) 225-7760
> >> micha...@gene.com