Re: [Bioc-devel] exptData(SummarizedExperiment)
SummarizedExperiment was just an example. I agree it can be a little challenging for end users to know where to find a particular functionality but I'm not sure about using meta packages to address that. At least I feel we should probably avoid creating new meta packages out of the blue, with arbitrary limits and possibly endless discussions about what exactly goes in them. Also I don't think there is a single core but rather several domain-specific cores.

What about using the existing workflow packages instead? A workflow package (like the variants package here http://bioconductor.org/help/workflows/variants/) covers a specific domain and loading it should load the core for that domain. Plus the user gets a great vignette as a bonus to get started so it's not just an empty shell.

There are probably some shortcomings with workflow packages that would need to be addressed before they can serve as convenient meta packages though e.g. they're treated too differently from other BioC packages (e.g. they're not available via biocLite() and don't show up under the biocViews tree here http://bioconductor.org/packages/release/BiocViews.html). Nothing that seems impossible to address though...

H.

On 05/12/2015 03:22 PM, Michael Lawrence wrote:
> It's more general than SummarizedExperiment. I think people would appreciate a simple way to load the core, without having to remember, for example, that VCF reading is in VariantAnnotation.
>
> On Mon, May 11, 2015 at 9:51 PM, Hervé Pagès <hpa...@fredhutch.org> wrote:
>
>> Hi Michael,
>>
>> On 05/11/2015 05:35 PM, Michael Lawrence wrote:
>>> Splitting stuff into different packages is good for modularity, but tough on the mind of the user. What about having some sort of meta package that simply loads the core infrastructure packages? Named something simple like Genomics or GenomicsCore.
>>
>> Don't know if we need this. For example, for all the SummarizedExperiment use cases I ran into, the end-user generally only needs to load the corresponding high-level package (DESeq2, VariantAnnotation, minfi, GenomicAlignments, etc...) and that takes care of loading all the low-level infrastructure packages.
>>
>> H.
>>
>>> On Mon, May 11, 2015 at 5:10 PM, Hervé Pagès <hpa...@fredhutch.org> wrote:
>>>
>>>> Hi Tim,
>>>>
>>>> The SummarizedExperiment class is being replaced with the RangedSummarizedExperiment class from the new SummarizedExperiment package. This is a work-in-progress and the name and internal representation of the RangedSummarizedExperiment class are not finalized yet. The main goal for now is to move all the SummarizedExperiment stuff from GenomicRanges to its own package.
>>>>
>>>> Anyway, metadata() is the replacement for exptData() on RangedSummarizedExperiment objects. It's on my list to add an exptData method for backward compatibility.
>>>>
>>>> Cheers,
>>>> H.
>>>>
>>>> On 05/11/2015 04:37 PM, Tim Triche, Jr. wrote:
>>>>> who determined that breaking this would be a good idea?!?
>>>>>
>>>>> R> ?SummarizedExperiment
>>>>> Help on topic 'SummarizedExperiment' was found in the following packages:
>>>>>
>>>>>   Package               Library
>>>>>   GenomicRanges         /home/tim/R/x86_64-pc-linux-gnu-library/3.2
>>>>>   SummarizedExperiment  /home/tim/R/x86_64-pc-linux-gnu-library/3.2
>>>>>
>>>>> R> nrows <- 200; ncols <- 6
>>>>> R> counts <- matrix(runif(nrows * ncols, 1, 1e4), nrows)
>>>>> R> rowRanges <- GRanges(rep(c("chr1", "chr2"), c(50, 150)),
>>>>> +                       IRanges(floor(runif(200, 1e5, 1e6)), width=100),
>>>>> +                       strand=sample(c("+", "-"), 200, TRUE))
>>>>> R> colData <- DataFrame(Treatment=rep(c("ChIP", "Input"), 3),
>>>>> +                       row.names=LETTERS[1:6])
>>>>> R> sset <- SummarizedExperiment(assays=SimpleList(counts=counts),
>>>>> +                               rowRanges=rowRanges, colData=colData)
>>>>> R> sset
>>>>> class: RangedSummarizedExperiment
>>>>> dim: 200 6
>>>>> metadata(0):
>>>>> assays(1): counts
>>>>> rownames: NULL
>>>>> rowRanges metadata column names(0):
>>>>> colnames(6): A B ... E F
>>>>> colData names(1): Treatment
>>>>> R> assayNames(sset)
>>>>> [1] "counts"
>>>>> R> assays(sset) <- endoapply(assays(sset), asinh)
>>>>> R>
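
The meta-package idea debated above amounts to attaching a fixed list of packages in one call. A minimal base-R sketch of that idea follows. The helper `attach_core()` and the `domain_cores` list are hypothetical names introduced here for illustration; neither exists in Bioconductor, and the per-domain groupings are arbitrary picks from packages named in this thread, not an agreed list.

```r
# Hypothetical mapping from a domain to the packages forming its "core".
# The groupings are illustrative only, not an agreed Bioconductor list.
domain_cores <- list(
  variants = c("VariantAnnotation", "GenomicRanges"),
  rnaseq   = c("DESeq2", "GenomicAlignments")
)

# Attach every installed package in `pkgs` and report the missing ones
# instead of stopping at the first failure.
# Returns a named logical vector: TRUE where the package was available.
attach_core <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  for (pkg in pkgs[installed]) {
    suppressPackageStartupMessages(library(pkg, character.only = TRUE))
  }
  if (any(!installed)) {
    message("Not installed: ", paste(pkgs[!installed], collapse = ", "))
  }
  installed
}

# Demonstration with packages that ship with R, so this runs anywhere.
status <- attach_core(c("stats", "utils"))
print(status)
```

With the Bioconductor packages installed, `attach_core(domain_cores$variants)` would attach that domain's packages. A real meta or workflow package would more likely get the same effect by listing those packages under `Depends:` in its DESCRIPTION file, since packages listed there are attached when the package itself is loaded with `library()`.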
Re: [Bioc-devel] exptData(SummarizedExperiment)
It's more general than SummarizedExperiment. I think people would appreciate a simple way to load the core, without having to remember, for example, that VCF reading is in VariantAnnotation.

On Mon, May 11, 2015 at 9:51 PM, Hervé Pagès <hpa...@fredhutch.org> wrote:

> Hi Michael,
>
> On 05/11/2015 05:35 PM, Michael Lawrence wrote:
>> Splitting stuff into different packages is good for modularity, but tough on the mind of the user. What about having some sort of meta package that simply loads the core infrastructure packages? Named something simple like Genomics or GenomicsCore.
>
> Don't know if we need this. For example, for all the SummarizedExperiment use cases I ran into, the end-user generally only needs to load the corresponding high-level package (DESeq2, VariantAnnotation, minfi, GenomicAlignments, etc...) and that takes care of loading all the low-level infrastructure packages.
>
> H.
>
>> On Mon, May 11, 2015 at 5:10 PM, Hervé Pagès <hpa...@fredhutch.org> wrote:
>>
>>> Hi Tim,
>>>
>>> The SummarizedExperiment class is being replaced with the RangedSummarizedExperiment class from the new SummarizedExperiment package. This is a work-in-progress and the name and internal representation of the RangedSummarizedExperiment class are not finalized yet. The main goal for now is to move all the SummarizedExperiment stuff from GenomicRanges to its own package.
>>>
>>> Anyway, metadata() is the replacement for exptData() on RangedSummarizedExperiment objects. It's on my list to add an exptData method for backward compatibility.
>>>
>>> Cheers,
>>> H.
>>>
>>> On 05/11/2015 04:37 PM, Tim Triche, Jr. wrote:
>>>> who determined that breaking this would be a good idea?!?
>>>>
>>>> R> ?SummarizedExperiment
>>>> Help on topic 'SummarizedExperiment' was found in the following packages:
>>>>
>>>>   Package               Library
>>>>   GenomicRanges         /home/tim/R/x86_64-pc-linux-gnu-library/3.2
>>>>   SummarizedExperiment  /home/tim/R/x86_64-pc-linux-gnu-library/3.2
>>>>
>>>> R> nrows <- 200; ncols <- 6
>>>> R> counts <- matrix(runif(nrows * ncols, 1, 1e4), nrows)
>>>> R> rowRanges <- GRanges(rep(c("chr1", "chr2"), c(50, 150)),
>>>> +                       IRanges(floor(runif(200, 1e5, 1e6)), width=100),
>>>> +                       strand=sample(c("+", "-"), 200, TRUE))
>>>> R> colData <- DataFrame(Treatment=rep(c("ChIP", "Input"), 3),
>>>> +                       row.names=LETTERS[1:6])
>>>> R> sset <- SummarizedExperiment(assays=SimpleList(counts=counts),
>>>> +                               rowRanges=rowRanges, colData=colData)
>>>> R> sset
>>>> class: RangedSummarizedExperiment
>>>> dim: 200 6
>>>> metadata(0):
>>>> assays(1): counts
>>>> rownames: NULL
>>>> rowRanges metadata column names(0):
>>>> colnames(6): A B ... E F
>>>> colData names(1): Treatment
>>>> R> assayNames(sset)
>>>> [1] "counts"
>>>> R> assays(sset) <- endoapply(assays(sset), asinh)
>>>> R> head(assay(sset))
>>>>         A    B    C    D    E    F
>>>> [1,] 6.89 8.81 9.46 9.20 8.88 9.44
>>>> [2,] 5.07 9.70 4.08 7.47 8.91 5.64
>>>> [3,] 9.88 9.84 8.95 9.07 9.86 9.06
>>>> [4,] 9.89 8.88 8.92 8.05 8.46 9.51
>>>> [5,] 9.75 8.48 4.73 9.86 8.43 9.86
>>>> [6,] 9.29 9.13 9.80 9.77 9.50 8.40
>>>> R> exptData(sset)
>>>> Error in (function (classes, fdef, mtable) :
>>>>   unable to find an inherited method for function 'exptData' for signature 'RangedSummarizedExperiment'
>>>>
>>>> It's one of those things that's a handy place to put data when you need to carry it around for the same set of people/subjects but don't have a handy multidimensional container for it. So it's a bit of a drag that it now breaks...
>>>>
>>>> Bonus:
>>>>
>>>> R> ?"exptData,SummarizedExperiment-method"
>>>> SummarizedExperiment-class    package:GenomicRanges    R Documentation
>>>>
>>>> SummarizedExperiment instances
>>>>
>>>> Description:
>>>>
>>>>      The SummarizedExperiment class is a matrix-like container where rows represent ranges of interest (as a 'GRanges or GRangesList-class') and columns represent samples (with sample data summarized as a 'DataFrame-class'). A 'SummarizedExperiment' contains one or more assays, each represented by a matrix-like object of numeric or other mode.
>>>>
>>>> R> sessionInfo()
>>>> R version 3.2.0 (2015-04-16)
>>>> Platform: x86_64-pc-linux-gnu (64-bit)
>>>> Running under: Ubuntu 15.04
>>>>
>>>> locale:
>>>>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>>>>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>>>>  [5]
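
As a sketch of the replacement Hervé describes in this thread, experiment-level data can be stored and read back with metadata() on a RangedSummarizedExperiment. This assumes the SummarizedExperiment package is installed. The `exptData_compat()` wrapper is a hypothetical user-side shim and not the backward-compatible method he says is on his list; the `protocol` entry is made-up content for illustration.

```r
library(SummarizedExperiment)

# Small object along the lines of the session quoted in this thread.
nrows <- 20; ncols <- 6
counts <- matrix(runif(nrows * ncols, 1, 1e4), nrows)
rowRanges <- GRanges(rep("chr1", nrows),
                     IRanges(seq(1e5, by = 1000, length.out = nrows),
                             width = 100))
colData <- DataFrame(Treatment = rep(c("ChIP", "Input"), 3),
                     row.names = LETTERS[1:6])
sset <- SummarizedExperiment(assays = SimpleList(counts = counts),
                             rowRanges = rowRanges, colData = colData)

# Store experiment-level data in the metadata list (illustrative content).
metadata(sset)$protocol <- "illustrative protocol note"

# Hypothetical user-side shim for code written against an
# exptData()-style accessor: it simply returns the metadata list.
exptData_compat <- function(x) metadata(x)

exptData_compat(sset)$protocol
```

One difference to keep in mind when porting code: metadata() returns an ordinary list, so anything that relied on the container type returned by the old accessor may need adjusting.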
Re: [Bioc-devel] exptData(SummarizedExperiment)
Agreed that the workflow vehicle should get more attention. Do all workflows correspond to packages?

On Tue, May 12, 2015 at 7:31 PM, Michael Lawrence <lawrence.mich...@gene.com> wrote:

> I like the idea of having multiple, domain-specific cores. Those could also serve as a vehicle for high-level documentation, including the workflows but also more cheat-sheet and/or cookbook-style documentation. Rafa has brought this up on the phone calls.
>
> On Tue, May 12, 2015 at 4:10 PM, Hervé Pagès <hpa...@fredhutch.org> wrote:
>
>> SummarizedExperiment was just an example. I agree it can be a little challenging for end users to know where to find a particular functionality but I'm not sure about using meta packages to address that. At least I feel we should probably avoid creating new meta packages out of the blue, with arbitrary limits and possibly endless discussions about what exactly goes in them. Also I don't think there is a single core but rather several domain-specific cores.
>>
>> What about using the existing workflow packages instead? A workflow package (like the variants package here http://bioconductor.org/help/workflows/variants/) covers a specific domain and loading it should load the core for that domain. Plus the user gets a great vignette as a bonus to get started so it's not just an empty shell.
>>
>> There are probably some shortcomings with workflow packages that would need to be addressed before they can serve as convenient meta packages though e.g. they're treated too differently from other BioC packages (e.g. they're not available via biocLite() and don't show up under the biocViews tree here http://bioconductor.org/packages/release/BiocViews.html). Nothing that seems impossible to address though...
>>
>> H.
>>
>> On 05/12/2015 03:22 PM, Michael Lawrence wrote:
>>> It's more general than SummarizedExperiment. I think people would appreciate a simple way to load the core, without having to remember, for example, that VCF reading is in VariantAnnotation.
>>>
>>> On Mon, May 11, 2015 at 9:51 PM, Hervé Pagès <hpa...@fredhutch.org> wrote:
>>>
>>>> Hi Michael,
>>>>
>>>> On 05/11/2015 05:35 PM, Michael Lawrence wrote:
>>>>> Splitting stuff into different packages is good for modularity, but tough on the mind of the user. What about having some sort of meta package that simply loads the core infrastructure packages? Named something simple like Genomics or GenomicsCore.
>>>>
>>>> Don't know if we need this. For example, for all the SummarizedExperiment use cases I ran into, the end-user generally only needs to load the corresponding high-level package (DESeq2, VariantAnnotation, minfi, GenomicAlignments, etc...) and that takes care of loading all the low-level infrastructure packages.
>>>>
>>>> H.
>>>>
>>>>> On Mon, May 11, 2015 at 5:10 PM, Hervé Pagès <hpa...@fredhutch.org> wrote:
>>>>>
>>>>>> Hi Tim,
>>>>>>
>>>>>> The SummarizedExperiment class is being replaced with the RangedSummarizedExperiment class from the new SummarizedExperiment package. This is a work-in-progress and the name and internal representation of the RangedSummarizedExperiment class are not finalized yet. The main goal for now is to move all the SummarizedExperiment stuff from GenomicRanges to its own package.
>>>>>>
>>>>>> Anyway, metadata() is the replacement for exptData() on RangedSummarizedExperiment objects. It's on my list to add an exptData method for backward compatibility.
>>>>>>
>>>>>> Cheers,
>>>>>> H.
>>>>>>
>>>>>> On 05/11/2015 04:37 PM, Tim Triche, Jr. wrote:
>>>>>>> who determined that breaking this would be a good idea?!?
>>>>>>>
>>>>>>> R> ?SummarizedExperiment
>>>>>>> Help on topic 'SummarizedExperiment' was found in the following packages:
>>>>>>>
>>>>>>>   Package               Library
>>>>>>>   GenomicRanges         /home/tim/R/x86_64-pc-linux-gnu-library/3.2
>>>>>>>   SummarizedExperiment  /home/tim/R/x86_64-pc-linux-gnu-library/3.2
>>>>>>>
>>>>>>> R> nrows <- 200; ncols <- 6
>>>>>>> R> counts <- matrix(runif(nrows * ncols, 1, 1e4), nrows)
>>>>>>> R> rowRanges <- GRanges(rep(c("chr1", "chr2"), c(50, 150)),
>>>>>>> +                       IRanges(floor(runif(200, 1e5, 1e6)), width=100),
>>>>>>> +                       strand=sample(c("+", "-"), 200, TRUE))
>>>>>>> R> colData <- DataFrame(Treatment=rep(c("ChIP", "Input"), 3),
>>>>>>> +