Re: [Bioc-devel] New SE or new assay in SE?

2020-01-29 Thread Pages, Herve
Just after I pressed the "Send" button I realized that by returning a 
new SE object you probably meant returning an SE object with only the 
new assay in it. I would favor the other option i.e. 'doProcess(se)' 
adds a new assay to 'se'. I think that's what most workflows based on SE 
objects do.

This doesn't mean that you can't provide a lower-level function that 
returns the transformed data in a "naked" matrix (i.e. not wrapped 
inside an SE). This lets the (more advanced) user decide what they want 
to do with it, e.g. they can add it to the original SE:

 assay(se, "normalized") <- normalized_data

or wrap it in its own new SE:

 normalized <- SummarizedExperiment(list(normalized=normalized_data))
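
A minimal sketch of that two-level design, assuming the SummarizedExperiment package is installed. The names logTransform(), the 'from'/'to' arguments, and the "logcounts" assay name are made up for illustration; only doProcess() comes from the original question, and its body here is just one possible implementation:

```r
library(SummarizedExperiment)

# Hypothetical lower-level function: takes a matrix, returns a "naked" matrix.
logTransform <- function(x, pseudocount = 1) {
    log2(x + pseudocount)
}

# Hypothetical higher-level wrapper: adds the result to 'se' as a new assay
# and returns the modified copy.
doProcess <- function(se, from = "counts", to = "logcounts") {
    assay(se, to) <- logTransform(assay(se, from))
    se
}

counts <- matrix(c(0, 1, 3, 7, 15, 31), nrow = 3,
                 dimnames = list(paste0("gene", 1:3), c("s1", "s2")))
se <- SummarizedExperiment(assays = list(counts = counts))

se2 <- doProcess(se)
assayNames(se2)  # "counts" "logcounts"
assayNames(se)   # "counts" (the caller's object is untouched)
```

Advanced users can still call logTransform(assay(se, "counts")) directly and decide for themselves where the result goes.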

H.

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319
___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


Re: [Bioc-devel] New SE or new assay in SE?

2020-01-29 Thread Pages, Herve
On 1/28/20 01:37, Laurent Gatto wrote:
> Dear all,
> 
> Assume we have a SummarizedExperiment object `se` that contains raw count 
> data, and a method `doProcess` that processes the data to produce a matrix of 
> identical dimensions (for example log-transformation, normalisation, 
> imputation, ...). What are the opinions in favour of or against the following 
> two options?
> 
> - `doProcess(se)` returns a new SE object
> - `doProcess(se)` adds a new assay to se

Aren't these the same?

SE objects are not reference objects, i.e. they follow R's standard 
copy-on-modify semantics. This means that they never get modified **in 
place** (in other words, they're not "mutable"). So 'doProcess(se)' will 
always return a new object, whatever you do inside the function, even 
if the function modifies 'se' internally, e.g. with something like:

   assay(se, "new_assay") <- new_assay

Note that the assay() setter itself, like all setters, also produces a new 
object. R actually evaluates the following code

   assay(se, "new_assay") <- new_assay

as if it were

   se <- `assay<-`(se, "new_assay", value=new_assay)

As you can see, the previous `se` is replaced with the new one, which 
gives the **illusion** of in-place modification, but it isn't one.
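
The same behaviour can be sketched in base R without any SE machinery. This is only a toy illustration: the list-based object, the `toyAssay<-` replacement function and addDoubled() are all made up for the example and are not part of SummarizedExperiment:

```r
# Toy stand-in for an SE: a plain list holding a list of assays.
obj <- list(assays = list(counts = matrix(1:4, nrow = 2)))

# Hypothetical replacement function, analogous in spirit to `assay<-`.
`toyAssay<-` <- function(x, name, value) {
    x$assays[[name]] <- value  # modifies the function's local copy only
    x                          # a setter must return the modified object
}

# A function that "modifies" its argument internally.
addDoubled <- function(x) {
    toyAssay(x, "doubled") <- x$assays$counts * 2
    x
}

res <- addDoubled(obj)
names(obj$assays)  # "counts" (the caller's object is unchanged)
names(res$assays)  # "counts" "doubled"

# Calling the replacement function explicitly gives the same result.
res2 <- `toyAssay<-`(obj, "doubled", value = obj$assays$counts * 2)
identical(res, res2)  # TRUE
```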

Hope this helps,
H.


> 
> If you are interested in the broader context of this question, see 
> https://github.com/waldronlab/MultiAssayExperiment/issues/266
> 
> Thank you in advance for your input.
> 
> Laurent



Re: [Bioc-devel] New SE or new assay in SE?

2020-01-28 Thread Tim Triche, Jr.
Assume the object is backed by an HDF5 or Zarr array, for the sake of argument 
and because that’s kind of how it works these days for many people. Also assume 
the “SE*” may not actually be an SE, but rather some wacky subclass of SE. 

If you return a new SE*, you need to copy all the metadata, any weird slots 
that subclasses of SE have added, etc. and you need to make sure you’re 
returning what you think you are. This has been an issue for me sometimes when 
converting SE-like objects to HDF5Array-backed SE-like objects (e.g. 
GenomicRatioSet).

If you just add the assay to SE*, you may be writing HDF5 or Zarr to disk for a 
while, but at least you don’t have to care what all else SE* contains or does. 
That’s the subclasser’s problem. If their methods suck, file a PR and let them 
merge it! Meanwhile your functions can call doProcess(SE*) if they notice that 
its output is missing when it ought to be present, and regardless of what SE* 
really is, they ought to work. You could check to see if SE* is some type of 
object for which doProcess() is inappropriate, but on balance, I’d add the 
assay and return SE* as itself. 
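
A rough sketch of the difference, assuming the SummarizedExperiment package is installed. The "ToySE" subclass and its 'note' slot are invented for the example and stand in for whatever a real subclass adds; the HDF5/Zarr side is left out entirely:

```r
library(SummarizedExperiment)

# Hypothetical SE subclass with one extra slot.
.ToySE <- setClass("ToySE", contains = "SummarizedExperiment",
                   slots = c(note = "character"))

counts <- matrix(1:6, nrow = 3)
obj <- .ToySE(SummarizedExperiment(assays = list(counts = counts)),
              note = "kept")

# Option 1: add the processed data as a new assay.
# The class and the extra slot come along without any special handling.
a <- obj
assay(a, "logcounts") <- log2(assay(obj, "counts") + 1)
is(a, "ToySE")  # TRUE
a@note          # "kept"

# Option 2: wrap the processed data in a freshly built SE.
# Anything the subclass carried must be copied over by hand.
b <- SummarizedExperiment(
    assays = list(logcounts = log2(assay(obj, "counts") + 1)))
is(b, "ToySE")  # FALSE
```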

--t

> On Jan 28, 2020, at 4:38 AM, Laurent Gatto wrote:



[Bioc-devel] New SE or new assay in SE?

2020-01-28 Thread Laurent Gatto
Dear all,

Assume we have a SummarizedExperiment object `se` that contains raw count data, 
and a method `doProcess` that processes the data to produce a matrix of 
identical dimensions (for example log-transformation, normalisation, 
imputation, ...). What are the opinions in favour of or against the following 
two options?

- `doProcess(se)` returns a new SE object 
- `doProcess(se)` adds a new assay to se

If you are interested in the broader context of this question, see 
https://github.com/waldronlab/MultiAssayExperiment/issues/266

Thank you in advance for your input.

Laurent



