Re: [Bioc-devel] exptData(SummarizedExperiment)

2015-05-12 Thread Hervé Pagès

SummarizedExperiment was just an example. I agree it can be a
little challenging for end users to know where to find a particular
functionality but I'm not sure about using meta packages to address
that. At least I feel we should probably avoid creating new meta
packages out of the blue, with arbitrary limits and possibly endless
discussions about what exactly goes in them. Also I don't think there
is a single core but rather several domain-specific cores.

What about using the existing workflow packages instead?
A workflow package (like the variants package here
http://bioconductor.org/help/workflows/variants/)
covers a specific domain and loading it should load the core
for that domain. Plus the user gets a great vignette as a bonus
to get started so it's not just an empty shell.

There are probably some shortcomings with workflow packages
that would need to be addressed before they can serve as
convenient meta packages, though. In particular, they're treated too
differently from other BioC packages (e.g. they're not available
via biocLite() and don't show up under the biocViews tree here
http://bioconductor.org/packages/release/BiocViews.html).
Nothing that seems impossible to address though...

H.


On 05/12/2015 03:22 PM, Michael Lawrence wrote:

It's more general than SummarizedExperiment. I think people would
appreciate a simple way to load the core, without having to remember,
for example, that VCF reading is in VariantAnnotation.


Re: [Bioc-devel] exptData(SummarizedExperiment)

2015-05-12 Thread Michael Lawrence
It's more general than SummarizedExperiment. I think people would
appreciate a simple way to load the core, without having to remember, for
example, that VCF reading is in VariantAnnotation.
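
For readers wondering what "a simple way to load the core" could amount to in practice, here is a minimal sketch in plain R. The helper name `attach_core` is hypothetical and is not part of any Bioconductor package; the sketch simply attaches a caller-supplied vector of packages after checking that all of them are installed. It is demonstrated with base packages so it runs anywhere.

```r
# Hypothetical helper (an assumption for illustration, not an existing API):
# attach a set of packages in one call, failing early if any is missing.
attach_core <- function(pkgs) {
  # requireNamespace() loads without attaching, so nothing ends up on the
  # search path if one of the packages turns out to be unavailable.
  available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  absent <- pkgs[!available]
  if (length(absent) > 0) {
    stop("packages not installed: ", paste(absent, collapse = ", "))
  }
  for (pkg in pkgs) {
    suppressPackageStartupMessages(library(pkg, character.only = TRUE))
  }
  invisible(pkgs)
}

# Base packages are used here only so the sketch is runnable as-is; a
# domain-specific "core" would list Bioconductor packages instead.
attached <- attach_core(c("stats", "utils"))
```

A meta package would achieve much the same effect declaratively, by listing the core packages in its `Depends` field, which is where the questions raised in this thread about what belongs in the list come in.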

On Mon, May 11, 2015 at 9:51 PM, Hervé Pagès hpa...@fredhutch.org wrote:

 Hi Michael,

 On 05/11/2015 05:35 PM, Michael Lawrence wrote:

 Splitting stuff into different packages is good for modularity, but
 tough on the mind of the user. What about having some sort of meta
 package that simply loads the core infrastructure packages? Named
 something simple like Genomics or GenomicsCore.


 Don't know if we need this. For example, for all the
 SummarizedExperiment use cases I ran into, the end-user generally
 only needs to load the corresponding high-level package (DESeq2,
 VariantAnnotation, minfi, GenomicAlignments, etc...) and that takes
 care of loading all the low-level infrastructure packages.

 H.


 On Mon, May 11, 2015 at 5:10 PM, Hervé Pagès hpa...@fredhutch.org wrote:

 Hi Tim,

 The SummarizedExperiment class is being replaced with the
 RangedSummarizedExperiment class from the new SummarizedExperiment
 package. This is a work-in-progress and the name and internal
 representation of the RangedSummarizedExperiment class are not
 finalized yet. The main goal for now is to move all the
 SummarizedExperiment stuff from GenomicRanges to its own package.

 Anyway, metadata() is the replacement for exptData() on
 RangedSummarizedExperiment objects. It's on my list to add
 an exptData method for backward compatibility.

 Cheers,
 H.
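
As an illustration of the replacement described above, the following sketch stores and retrieves experiment-level information with `metadata()`. It assumes the SummarizedExperiment package is installed; the object and the metadata entries (`protocol`, `batch`, `notes`) are made up for the example.

```r
library(SummarizedExperiment)

# A tiny object with a single assay; the values are arbitrary.
counts <- matrix(1:12, nrow = 4,
                 dimnames = list(NULL, c("A", "B", "C")))
se <- SummarizedExperiment(assays = SimpleList(counts = counts))

# metadata() holds an ordinary list, so arbitrary experiment-level
# information can be attached to the object and carried along with it.
metadata(se) <- list(protocol = "ChIP-seq", batch = 1L)
metadata(se)$notes <- "free-form annotations can go here"

names(metadata(se))
```

Code that previously called `exptData(x)` would, under this assumption, call `metadata(x)` instead. Note that `metadata()` returns a plain list, so code relying on the old return type may need adjusting.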


 On 05/11/2015 04:37 PM, Tim Triche, Jr. wrote:

 who determined that breaking this would be a good idea?!?

 R> ?SummarizedExperiment
 Help on topic 'SummarizedExperiment' was found in the following
 packages:

   Package               Library
   GenomicRanges         /home/tim/R/x86_64-pc-linux-gnu-library/3.2
   SummarizedExperiment  /home/tim/R/x86_64-pc-linux-gnu-library/3.2

 R> nrows <- 200; ncols <- 6
 R> counts <- matrix(runif(nrows * ncols, 1, 1e4), nrows)
 R> rowRanges <- GRanges(rep(c("chr1", "chr2"), c(50, 150)),
 +                       IRanges(floor(runif(200, 1e5, 1e6)), width=100),
 +                       strand=sample(c("+", "-"), 200, TRUE))
 R> colData <- DataFrame(Treatment=rep(c("ChIP", "Input"), 3),
 +                       row.names=LETTERS[1:6])
 R> sset <- SummarizedExperiment(assays=SimpleList(counts=counts),
 +                               rowRanges=rowRanges, colData=colData)
 R> sset
 class: RangedSummarizedExperiment
 dim: 200 6
 metadata(0):
 assays(1): counts
 rownames: NULL
 rowRanges metadata column names(0):
 colnames(6): A B ... E F
 colData names(1): Treatment
 R> assayNames(sset)
 [1] "counts"
 R> assays(sset) <- endoapply(assays(sset), asinh)
 R> head(assay(sset))
         A    B    C    D    E    F
 [1,] 6.89 8.81 9.46 9.20 8.88 9.44
 [2,] 5.07 9.70 4.08 7.47 8.91 5.64
 [3,] 9.88 9.84 8.95 9.07 9.86 9.06
 [4,] 9.89 8.88 8.92 8.05 8.46 9.51
 [5,] 9.75 8.48 4.73 9.86 8.43 9.86
 [6,] 9.29 9.13 9.80 9.77 9.50 8.40
 R> exptData(sset)
 Error in (function (classes, fdef, mtable)  :
   unable to find an inherited method for function 'exptData' for signature
   'RangedSummarizedExperiment'
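
The dispatch error above means that no `exptData` method is defined for the new class. A backward-compatibility method of the kind mentioned in this thread could, in principle, be a thin deprecated wrapper around `metadata()`. The sketch below shows that pattern on a toy S4 class using only the methods package; `ToyExperiment` and its slot are illustrative stand-ins, not the real SummarizedExperiment implementation.

```r
library(methods)

# Toy stand-in for the real class; class and slot names are assumptions.
setClass("ToyExperiment", representation(metadata = "list"))

# New-style accessor.
setGeneric("metadata", function(x, ...) standardGeneric("metadata"))
setMethod("metadata", "ToyExperiment", function(x, ...) x@metadata)

# Old-style accessor kept as a deprecated wrapper that forwards to the
# new one, so existing code keeps working but emits a warning.
setGeneric("exptData", function(x, ...) standardGeneric("exptData"))
setMethod("exptData", "ToyExperiment", function(x, ...) {
  .Deprecated("metadata")
  metadata(x)
})

toy <- new("ToyExperiment", metadata = list(batch = "run1"))
old_style <- suppressWarnings(exptData(toy))  # works, with a deprecation warning
new_style <- metadata(toy)
```

Both calls return the same list here; the wrapper only buys time for callers to migrate.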



 It's one of those things that's a handy place to put data when you need
 to carry it around for the same set of people/subjects but don't have a
 handy multidimensional container for it.  So it's a bit of a drag that
 it now breaks...


 Bonus:

 R> ?"exptData,SummarizedExperiment-method"

 SummarizedExperiment-class    package:GenomicRanges    R Documentation

 SummarizedExperiment instances

 Description:

      The SummarizedExperiment class is a matrix-like container where
      rows represent ranges of interest (as a 'GRanges or
      GRangesList-class') and columns represent samples (with sample
      data summarized as a 'DataFrame-class'). A 'SummarizedExperiment'
      contains one or more assays, each represented by a matrix-like
      object of numeric or other mode.




 R> sessionInfo()
 R version 3.2.0 (2015-04-16)
 Platform: x86_64-pc-linux-gnu (64-bit)
 Running under: Ubuntu 15.04

 locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
[5] 

Re: [Bioc-devel] exptData(SummarizedExperiment)

2015-05-12 Thread Vincent Carey
Agreed that the workflow vehicle should get more attention.  Do all
workflows correspond to packages?

On Tue, May 12, 2015 at 7:31 PM, Michael Lawrence lawrence.mich...@gene.com
 wrote:

 I like the idea of having multiple, domain-specific cores. Those could also
 serve as a vehicle for high-level documentation, including the workflows
 but also more cheat-sheet and/or cookbook-style documentation. Rafa has
 brought this up on the phone calls.

