I think the cleaner solution for Davide (if he insists on having separate objects; I decided against it in minfi) is to extend the class to allow this.

Kasper

On Thu, Jun 18, 2015 at 12:25 AM, Ryan <r...@thompsonclan.org> wrote:

> Oh wow, I didn't know you could put a DataFrame into a single column of
> another DataFrame. That actually solves a problem for me too (I don't
> intend to expose nested DataFrames to the users though).
>
> On 6/17/15 7:23 PM, Martin Morgan wrote:
>
>> On 06/17/2015 11:41 AM, davide risso wrote:
>>
>>> Dear list,
>>>
>>> I'm creating an R package to store RNA-seq data of a somewhat large
>>> project in which I'm involved.
>>>
>>> One of the initial goals is to compare different pre-processing
>>> pipelines, hence I have multiple expression matrices corresponding to
>>> the same samples.
>>> The SummarizedExperiment class seems a good candidate, since I have
>>> multiple expression matrices with the same rowData and colData
>>> information.
>>>
>>> I have several sample-specific variables that I want to store with the
>>> object, namely, experimental information (e.g., batch, date, experimental
>>> condition, ...) and sample quality (e.g., proportion of aligned reads,
>>> total duplicate reads, etc...).
>>>
>>> Of course, I can always create one big data frame concatenating the two
>>> (experimental info + sample quality), but it seems that both conceptually
>>> and practically, it might be useful to have two separate data frames.
>>> Since this seems somewhat a reasonably standard type of information that
>>> one would want to carry on, I was wondering if it would be possible /
>>> useful to allow the user to have multiple data.frames in the colData slot
>>
>> Actually, colData() is a DataFrame, and a DataFrame column can contain a
>> DataFrame. So after
>>
>>     example(SummarizedExperiment)
>>
>> we could make some faux sample quality data
>>
>>     quality = DataFrame(x=1:6, y=6:1, row.names=colnames(se1))
>>
>> add this as a column in the colData()
>>
>>     colData(se1)$quality = quality
>>
>> (or create the SummarizedExperiment from a similar DataFrame up-front)
>> and manage our grouped data
>>
>>     > colData(se1)
>>     DataFrame with 6 rows and 2 columns
>>         Treatment     quality
>>       <character> <DataFrame>
>>     A        ChIP    ########
>>     B       Input    ########
>>     C        ChIP    ########
>>     D       Input    ########
>>     E        ChIP    ########
>>     F       Input    ########
>>     > colData(se1[,1:2])$quality
>>     DataFrame with 2 rows and 2 columns
>>               x         y
>>       <integer> <integer>
>>     A         1         6
>>     B         2         5
>>
>> I'm not sure that this is any less confusing to the end user than having
>> to manage a DataFrameList(), but it does not require any new features.
>>
>> Martin
>>
>>> of SummarizedExperiment.
>>>
>>> Best,
>>> Davide
>>>
>>> [[alternative HTML version deleted]]
>>>
>>> _______________________________________________
>>> Bioc-devel@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel