On 03/03/2016 10:49 AM, Peter Hickey wrote:
Hi Herve,

I agree, the abind::abind() signature is rather verbose and much of it is not
required in the context of a SummarizedExperiment. Perhaps "overriding"
abind::abind() with an S4 generic with a different signature isn't a good idea
and it would be better to have our own generic.

I quite like arbind() and acbind() as names. I guess these would live in the
SummarizedExperiment package?

Yes.


Happy to do further work on this but I won't have time until the weekend or
next week.

Thanks for offering. No rush.

H.


Cheers,
Pete

On Thu, 3 Mar 2016 at 13:31 Hervé Pagès <hpa...@fredhutch.org> wrote:

Hi Pete,

On 03/02/2016 12:42 PM, Peter Hickey wrote:
This is mostly directed to Herve and/or Martin, but I'd be interested
in others' input too.

The SummarizedExperiment package defines rbind,Assays-method and
cbind,Assays-method that are called when rbind() or cbind() is called
on a SummarizedExperiment object. In the case of a two-dimensional assay
(a matrix), these work much as if rbind/cbind were called on the matrix:

library(SummarizedExperiment)
m <- matrix(rnorm(100), nrow = 4, ncol = 25)
se1 <- SummarizedExperiment(m)
dim(assay(rbind(se1, se1)))
[1]  8 25
dim(rbind(assay(se1), assay(se1)))
[1]  8 25
dim(assay(cbind(se1, se1)))
[1]  4 50
dim(cbind(assay(se1), assay(se1)))
[1]  4 50

When an assay is an array with more than 2 dimensions, however, the
result of the rbind,Assays-method (resp. cbind,Assays-method) differs
from the rbind,array-method (resp. cbind,array-method). This is for a
good reason because it preserves the dimensionality of the assay in
the SummarizedExperiment object. So in fact the "rbind(...)" of the
assay is more like abind::abind(..., along = 1) and the "cbind(...)"
of the assay is more like abind::abind(..., along = 2):

x <- array(rnorm(100), dim = c(4, 5, 5))
se2 <- SummarizedExperiment(x)
dim(assay(rbind(se2, se2)))
[1] 8 5 5
dim(rbind(assay(se2), assay(se2)))
[1]   2 100
dim(abind::abind(assay(se2), assay(se2), along = 1))
[1] 8 5 5
identical(assay(rbind(se2, se2)),
          abind::abind(assay(se2), assay(se2), along = 1))
[1] TRUE
dim(assay(cbind(se2, se2)))
[1]  4 10  5
dim(cbind(assay(se2), assay(se2)))
[1] 100   2
dim(abind::abind(assay(se2), assay(se2), along = 2))
[1]  4 10  5
identical(assay(cbind(se2, se2)),
          abind::abind(assay(se2), assay(se2), along = 2))
[1] TRUE

rbind/cbind does not work for other "array-like" objects with > 2
dimensions in the assays slot of a SummarizedExperiment because the
internal function SummarizedExperiment:::.bind_assay_elements()
constructs a new array via array() if the assay has more than 2
dimensions, thus destroying the original class of the array-like
object.
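
To illustrate the class loss described above, here is a minimal base-R
sketch. It does not reproduce .bind_assay_elements() itself; it only shows
that rebuilding an object with array() drops its class. The S3 class name
"myarray" is made up for the illustration.

```r
# A toy array-like object: a 3-dimensional array tagged with a
# hypothetical S3 class ("myarray" is an assumption, not a real class).
x <- structure(array(seq_len(24), dim = c(2, 3, 4)), class = "myarray")

# Rebuilding the object with array() keeps the data and the dimensions
# but returns a plain array, so the original class is lost.
y <- array(x, dim = dim(x))

class(x)  # "myarray"
class(y)  # "array"
dim(y)    # 2 3 4
```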

What I'm wondering is whether there is a way to generalise rbind/cbind
of Assays to other array-like objects, provided that they have a suitable
method defined. It seems to me that a good candidate would be to
require that an object in the assays slot has an abind(..., along = 1)
and abind(..., along = 2) method defined if it has more than 2
dimensions. It might even be worth using abind::abind() when the
assay is an array with more than 2 dimensions to simplify the code
somewhat.

Thoughts? I'd be happy to work on a patch.

Requiring that abind(..., along=1) and abind(..., along=2) work on
assays of dim > 2 would work. Note that abind() has a complicated
signature (many extra arguments) but the "abind" methods that one
would need to implement wouldn't need to satisfy the full abind()
contract (in the context of SummarizedExperiment assays, satisfying
the full contract is not needed and would be too much work).

Alternatively we can introduce our own generics for that e.g.
abind1() and abind2(), or arbind() and acbind() (for "assay rbind"
and "assay cbind"). Advantages: the signatures would be cleaner,
the contracts simpler, and the methods easier to implement. Also
we wouldn't need to depend on the abind package.
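
A rough sketch of what such generics could look like, using only base R
and the methods package. The names arbind() and acbind() come from the
proposal above; the "..."-only signature, the helper bind_along(), and the
"array" methods are assumptions made for illustration, not the package's
actual code.

```r
library(methods)

# Hypothetical helper: bind same-shaped arrays along one dimension.
# It moves the `along` dimension to the last position, concatenates the
# underlying data, then moves the dimension back.
bind_along <- function(arrays, along) {
  ndim <- length(dim(arrays[[1L]]))
  perm <- c(seq_len(ndim)[-along], along)
  permuted <- lapply(arrays, aperm, perm = perm)
  out_dim <- dim(permuted[[1L]])
  out_dim[ndim] <- sum(vapply(permuted, function(a) dim(a)[ndim], integer(1)))
  # order(perm) is the inverse permutation of perm.
  aperm(array(unlist(permuted), dim = out_dim), order(perm))
}

# Generics that dispatch on "..." only (assumed signature).
setGeneric("arbind", function(...) standardGeneric("arbind"))
setGeneric("acbind", function(...) standardGeneric("acbind"))

# Methods for ordinary arrays; another array-like class would supply
# its own methods for the same two generics.
setMethod("arbind", "array", function(...) bind_along(list(...), along = 1L))
setMethod("acbind", "array", function(...) bind_along(list(...), along = 2L))

x <- array(rnorm(100), dim = c(4, 5, 5))
dim(arbind(x, x))  # 8 5 5
dim(acbind(x, x))  # 4 10 5
```

Unlike abind::abind(), this sketch ignores dimnames and does not check
that the non-bound dimensions agree; a real method would need both.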

What do you think?

H.


Cheers,
Pete

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319
