Thanks Michael, yes, in a sense ttBulk and SummarizedExperiment can be considered two interfaces. Would it be fair enough to create a function that converts from one to the other, although the default would be ttBulk?
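To make the converter idea concrete, here is a minimal base-R sketch of the round trip between a long tidy table and the wide transcripts-by-samples matrix that a SummarizedExperiment assay would hold. The column names and both helper functions are hypothetical and not part of ttBulk; a real converter would presumably also carry sample and transcript annotations across.

```r
# Hypothetical long ("tidy") table: one row per sample/transcript pair.
# Column names are illustrative only, not the ttBulk API.
tidy_counts <- data.frame(
  sample     = c("s1", "s1", "s2", "s2"),
  transcript = c("geneA", "geneB", "geneA", "geneB"),
  abundance  = c(10, 0, 7, 3),
  stringsAsFactors = FALSE
)

# Tidy -> wide: transcripts in rows, samples in columns.
# Assumption: sample/transcript pairs missing from the table count as 0.
tidy_to_matrix <- function(x) {
  m <- tapply(x$abundance, list(x$transcript, x$sample), sum)
  m[is.na(m)] <- 0
  m
}

# Wide -> tidy: back to one row per sample/transcript pair.
# as.vector() reads the matrix column by column, so samples vary slowest.
matrix_to_tidy <- function(m) {
  data.frame(
    sample     = rep(colnames(m), each = nrow(m)),
    transcript = rep(rownames(m), times = ncol(m)),
    abundance  = as.vector(m),
    stringsAsFactors = FALSE
  )
}

m <- tidy_to_matrix(tidy_counts)
round_trip <- matrix_to_tidy(m)
```

The wide matrix could then be handed to the SummarizedExperiment constructor as an assay; that step is left out so the sketch runs without Bioconductor installed.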

> I'm not sure the tidyverse is a great answer to the user interface, because it lacks domain semantics

Would it be fair to say that the ttBulk class could be considered a tibble with specific semantics? In the sense that it holds information about key column names (.sample, .transcript, .abundance, .normalised_abundance, etc.) and has a validator (that is triggered by every ttBulk function).

I think at the moment, given (i) the S3 problem, and (ii) the lack of a formal foundation for the SummarizedExperiment interface (which maybe would require an S4 technology itself, where SummarizedExperiment could be a slot?), my package would belong more to CRAN until those two issues have been resolved. I imagine there are not many cases where a CRAN package migrated to Bioconductor after complying with the ecosystem policies.

Thanks a lot.

Best wishes.

Stefano

Stefano Mangiola | Postdoctoral fellow
Papenfuss Laboratory
The Walter and Eliza Hall Institute of Medical Research
+61 (0)466452544

On Fri, 7 Feb 2020 at 12:12, Michael Lawrence <lawrence.mich...@gene.com> wrote:

> There's a difference between implementing software, where one wants
> formal data structures, and providing a convenient user interface.
> Software needs to interface with other software, so a package could
> provide both types of interfaces, one based on rich (S4) data
> structures, another on simpler structures with an API more amenable to
> analysis. I'm not sure the tidyverse is a great answer to the user
> interface, because it lacks domain semantics. This is still an active
> area of research (see Stuart Lee's plyranges, for example). I hope you
> can find a reasonable compromise that enables you to integrate ttBulk
> into Bioconductor, so that it can take advantage of the synergies the
> ecosystem provides.
>
> PS: There is no simple fix for your example.
>
> Michael
>
> On Thu, Feb 6, 2020 at 4:12 PM stefano <mangiolastef...@gmail.com> wrote:
> >
> > Thanks a lot for your comments, Martin and Michael.
> >
> > Here I reply to Martin's comment. Michael, I will try to implement your
> > solution!
> >
> > I think a key point from
> > https://github.com/Bioconductor/Contributions/issues/1355#issuecomment-580977106
> > (that I was overlooking) is
> >
> > >> "So to sum up: if you submit a package to Bioconductor, there is an
> > expectation that your package can work seamlessly with other Bioconductor
> > packages, and your implementation should support that. The safest and
> > easiest way to do that is to use Bioconductor data structures"
> >
> > In this case my package would not be suited, as I do not use pre-existing
> > Bioconductor data structures; instead I see value in using a simple
> > tibble, for the reasons explained in part in the README
> > https://github.com/stemangiola/ttBulk (harvesting the power of tidyverse
> > and friends for bulk transcriptomic analyses).
> >
> > >> "with the minimum standard of being able to accept such objects even if
> > you do not rely on them internally (though you should)"
> >
> > With this I can comply, in the sense that I can build converters to and
> > from SummarizedExperiment (for example).
> >
> > >> "If you don't want to do that, then that's a shame, but it would
> > suggest that Bioconductor would not be the right place to host this
> > package."
> >
> > Well said.
> >
> > In summary, I do not rely on Bioconductor data structures, as I am
> > proposing another paradigm, but my back end is largely made of
> > Bioconductor analysis packages that I would like to interface with
> > tidyverse. So
> >
> > 1) Should I build converters to Bioc. data structures, and force the use
> > of S3 objects (needed for tidyverse to work), or
> > 2) Submit to CRAN
> >
> > I don't have a strong feeling for either, although I think Bioconductor
> > would be a good fit.
> > Please, community, give me your honest opinions; I will take them
> > seriously and proceed.
> >
> > Best wishes.
> >
> > Stefano
> >
> > Stefano Mangiola | Postdoctoral fellow
> > Papenfuss Laboratory
> > The Walter and Eliza Hall Institute of Medical Research
> > +61 (0)466452544
> >
> >
> > On Fri, 7 Feb 2020 at 10:46, Martin Morgan <mtmorgan.b...@gmail.com> wrote:
> >
> > > The idea isn't to use S4 at any cost, but to 'play well' with the
> > > Bioconductor ecosystem, including writing robust and maintainable code.
> > >
> > > This comment
> > > https://github.com/Bioconductor/Contributions/issues/1355#issuecomment-580977106
> > > provides some motivation; there was also an interesting exchange on the
> > > Bioconductor community slack about this (join at
> > > https://bioc-community.herokuapp.com/; discussion starting with
> > > https://community-bioc.slack.com/archives/C35G93GJH/p1580144746014800).
> > > The plyranges package http://bioconductor.org/packages/plyranges and
> > > recently accepted fluentGenomics workflow
> > > https://github.com/Bioconductor/Contributions/issues/1350 provide
> > > illustrations.
> > >
> > > In your domain it's really surprising that your package does not use
> > > (Import or Depend on) the SummarizedExperiment or GenomicRanges packages.
> > > From a superficial look at your package, it seems like something like
> > > `reduce_dimensions()` could be defined to take & return a
> > > SummarizedExperiment and hence benefit from some of the points in the
> > > github issue comment mentioned above.
> > >
> > > Certainly there is a useful transition, both 'on the way in' to a
> > > SummarizedExperiment, and after leaving the more specialized
> > > bioinformatic computations to, e.g., display a pairs plot of the reduced
> > > dimensions, where one might re-shape the data to a tidy format and use
> > > 'plain old' tibbles; the fluentGenomics workflow might provide some
> > > guidance.
> > >
> > > At the end of the day it would not be surprising for Bioconductor
> > > packages to make use of tidy concepts and data structures, particularly
> > > in the vignette, and it would be a mistake for Bioconductor to exclude
> > > well-motivated 'tidy' representations.
> > >
> > > Martin Morgan
> > >
> > > On 2/6/20, 5:46 PM, "Bioc-devel on behalf of stefano" <
> > > bioc-devel-boun...@r-project.org on behalf of mangiolastef...@gmail.com>
> > > wrote:
> > >
> > >     Hello,
> > >
> > >     I have a package (ttBulk) under review. I have been told to replace
> > >     the S3 system with S4. My package is based on the class tbl_df and
> > >     must be fully compatible with tidyverse methods (inheritance). After
> > >     some tests and research I understood that the tidyverse ecosystem is
> > >     not compatible with S4 classes.
> > >
> > >     For example, several methods apparently do not handle S4 objects
> > >     based on the S3 class tbl_df:
> > >
> > >     ```
> > >     library(tidyverse)
> > >     setOldClass("tbl_df")
> > >     setClass("test2", contains = "tbl_df")
> > >     my <- new("test2", tibble(a = 1))
> > >     my %>% mutate(b = 3)
> > >     #>   a b
> > >     #> 1 1 3
> > >     ```
> > >
> > >     ```
> > >     my <- new("test2", tibble(a = rnorm(100), b = 1))
> > >     my %>% nest(data = -b)
> > >     #> Error: `x` must be a vector, not a `test2` object
> > >     #> Run `rlang::last_error()` to see where the error occurred.
> > >     ```
> > >
> > >     Could you please advise whether a tidyverse-based package can be
> > >     hosted on Bioconductor, and whether S4 classes are really mandatory?
> > >     I need to understand if I am forced to submit to CRAN instead
> > >     (although Bioconductor would be a good fit, since I try to interface
> > >     transcriptional analysis tools with the tidy universe).
> > >
> > >
> > >     Thanks a lot.
> > >     Stefano
> > >
> > >         [[alternative HTML version deleted]]
> > >
> > >     _______________________________________________
> > >     Bioc-devel@r-project.org mailing list
> > >     https://stat.ethz.ch/mailman/listinfo/bioc-devel
> > >
> >
> --
> Michael Lawrence
> Senior Scientist, Bioinformatics and Computational Biology
> Genentech, A Member of the Roche Group
> Office +1 (650) 225-7760
> micha...@gene.com