Re: [Bioc-devel] New SE or new assay in SE?
Just after I pressed the "Send" button I realized that by returning a new SE object you probably meant returning an SE object with only the new assay in it.

I would favor the other option, i.e. 'doProcess(se)' adds a new assay to 'se'. I think that's what most workflows based on SE objects do.

This doesn't mean that you can't provide a lower-level function that returns the transformed data in a "naked" matrix (i.e. not wrapped inside an SE). This lets the (more advanced) user decide what they want to do with it, e.g. they can add it to the original SE:

    assay(se, "normalized") <- normalized_data

or wrap it in its own new SE:

    normalized <- SummarizedExperiment(list(normalized=normalized_data))

H.

-- 
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel
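The two layers suggested in the message above could be sketched as follows. This is only an illustration under stated assumptions: `logTransform()` is a hypothetical lower-level helper, the log2(x + 1) transformation and the assay names `counts` and `log_counts` are made up for the example, and `doProcess()` is just one possible shape for the function discussed in this thread.

```r
library(SummarizedExperiment)

## Hypothetical low-level helper: takes a matrix and returns a "naked" matrix
## of identical dimensions (a log2(x + 1) transformation is assumed here).
logTransform <- function(x) log2(x + 1)

## Hypothetical high-level function: adds the processed matrix to 'se' as a
## new assay and returns the result. The caller's object is left untouched.
doProcess <- function(se, from = "counts", to = "log_counts") {
    assay(se, to) <- logTransform(assay(se, from))
    se
}

## Toy data: 4 features x 3 samples of raw counts.
counts <- matrix(c(0, 1, 3, 7, 15, 31, 63, 127, 255, 511, 1023, 2047),
                 nrow = 4,
                 dimnames = list(paste0("f", 1:4), paste0("s", 1:3)))
se <- SummarizedExperiment(assays = list(counts = counts))

se2 <- doProcess(se)
assayNames(se)    # the original still holds only "counts"
assayNames(se2)   # "counts" and "log_counts"

## A more advanced user can call the helper directly and decide what to do
## with the result, e.g. wrap it in its own new SE:
normalized <- SummarizedExperiment(list(log_counts = logTransform(counts)))
```

Keeping the matrix-level helper separate means the SE-level wrapper stays a thin convenience layer, and users who prefer a standalone object are not forced into either choice.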
Re: [Bioc-devel] New SE or new assay in SE?
On 1/28/20 01:37, Laurent Gatto wrote:
> Dear all,
>
> Assume we have a SummarizedExperiment object `se` that contains raw count
> data, and a method `doProcess` that processes the data to produce a matrix of
> identical dimensions (for example log-transformation, normalisation,
> imputation, ...). What are the opinions in favour of or against the following
> two options
>
> - `doProcess(se)` returns a new SE object
> - `doProcess(se)` adds a new assay to se

Aren't these the same?

SE objects are not reference objects, i.e. they follow R's standard copy-on-change semantics. This means that they never get modified **in place** (aka they're not "mutable"). So 'doProcess(se)' will always return a new object, whatever you do inside the function, that is, even if the function modifies 'se' internally, e.g. with something like:

    assay(se, "new_assay") <- new_assay

Note that the assay() setter itself, like all setters, also produces a new object. R actually evaluates the following code

    assay(se, "new_assay") <- new_assay

as if it were

    se <- `assay<-`(se, "new_assay", value=new_assay)

As you can see, the previous `se` is replaced with the new one, which gives the **illusion** of in-place replacement, but it's not.

Hope this helps,
H.
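The behaviour described in the message above can be reproduced in base R, without any Bioconductor package. In this minimal sketch, `second<-` is a made-up replacement function standing in for `assay<-`, and `addOne()` is a hypothetical stand-in for `doProcess()`; neither comes from the thread.

```r
## A made-up replacement function, standing in for `assay<-`.
`second<-` <- function(x, value) {
    x[2] <- value
    x
}

x <- c(1, 2, 3)
y <- x                            # 'y' starts out with the same values as 'x'

second(y) <- 20                   # sugar for: y <- `second<-`(y, value = 20)
z <- `second<-`(x, value = 20)    # the explicit form, assigned to a new name

## A function that "modifies" its argument only changes its local copy;
## the caller sees the change only through the returned value.
addOne <- function(v) {
    second(v) <- v[2] + 1
    v
}
w <- addOne(x)

x   # unchanged: 1 2 3
y   # 1 20 3
w   # 1 3 3
```

Because `x` is never reassigned, it keeps its original values throughout, even though it was passed to both `second<-` and `addOne()`.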
Re: [Bioc-devel] New SE or new assay in SE?
Assume the object is backed by an HDF5 or Zarr array, for the sake of argument and because that’s kind of how it works these days for many people. Also assume the “SE*” may not actually be an SE, but rather some wacky subclass of SE.

If you return a new SE*, you need to copy all the metadata, any weird slots that subclasses of SE have added, etc., and you need to make sure you’re returning what you think you are. This has been an issue for me sometimes when converting SE-like objects to HDF5Array-backed SE-like objects (e.g. GenomicRatioSet).

If you just add the assay to SE*, you may be writing HDF5 or Zarr to disk for a while, but at least you don’t have to care what else SE* contains or does. That’s the subclasser’s problem. If their methods suck, file a PR and let them merge it! Meanwhile, your functions can call doProcess(SE*) if they notice that its output is missing when it ought to be present, and regardless of what SE* really is, they ought to work.

You could check to see if SE* is some type of object for which doProcess() is inappropriate, but on balance, I’d add the assay and return SE* as itself.

--t
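The subclass argument above can be illustrated with a small in-memory example (the HDF5/Zarr aspect is not covered here). Everything named below is an assumption made for the sketch: `ToySE` is a made-up subclass with one extra slot standing in for "SE*", and `addAssay()` is a hypothetical stand-in for `doProcess()`.

```r
library(SummarizedExperiment)

## A made-up SE subclass with one extra slot, standing in for "SE*".
setClass("ToySE", contains = "SummarizedExperiment",
         slots = c(note = "character"))

counts <- matrix(1:6, nrow = 2,
                 dimnames = list(c("f1", "f2"), c("s1", "s2", "s3")))
se <- SummarizedExperiment(assays = list(counts = counts))
x <- new("ToySE", se, note = "keep me")

## Option 1: add the assay and return the object as itself.
## The class and the extra slot come along without any extra work.
addAssay <- function(x) {
    assay(x, "log_counts") <- log2(assay(x, "counts") + 1)
    x
}
a <- addAssay(x)
is(a, "ToySE")   # TRUE
a@note           # "keep me"

## Option 2: build a fresh SE from the processed matrix.
## Unless the function copies them explicitly, the subclass and its
## extra slot are gone.
b <- SummarizedExperiment(list(log_counts = log2(assay(x, "counts") + 1)))
is(b, "ToySE")   # FALSE
```

With option 1 the function never needs to know which subclass it was handed, which is the point made above about leaving subclass details to the subclasser.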
[Bioc-devel] New SE or new assay in SE?
Dear all,

Assume we have a SummarizedExperiment object `se` that contains raw count data, and a method `doProcess` that processes the data to produce a matrix of identical dimensions (for example log-transformation, normalisation, imputation, ...). What are the opinions in favour of or against the following two options

- `doProcess(se)` returns a new SE object
- `doProcess(se)` adds a new assay to se

If you are interested in the broader context of this question, see https://github.com/waldronlab/MultiAssayExperiment/issues/266

Thank you in advance for your input.

Laurent