On Tue, Sep 9, 2014 at 12:30 AM, Hervé Pagès <hpa...@fhcrc.org> wrote:
> On 09/08/2014 06:42 PM, Michael Lawrence wrote:
>
>> Instead of printing out multiple lines of a table that is rarely of
>> interest, could we develop Peter's idea toward something like:
>>
>>   hg19:chr1 hg19:chr2 ...
>>   [lengths ...]
>>
>> Not sure what condensed notation would be useful for circularity.
>
> I don't know either. I'm worried that this would make the seqinfo
> stuff look like a named vector and that the user would expect
> hg19:chr1, hg19:chr2, etc... to be valid names.
>
> With the table-like layout, some screen real estate can always be
> saved by printing less lines:

What I had in mind was

>   > gr
>   GRanges with 3 ranges and 0 metadata columns:
>   genome: hg19
>         seqnames               ranges strand
>            <Rle>            <IRanges>  <Rle>
>     [1]    chr14 [19069583, 19069654]      +
>     [2]    chr14 [19363738, 19363809]      +
>     [3]    chr14 [19363755, 19363826]      -
>     [4]    chr14 [19369799, 19369870]      +

you could then probably dispense with the seqlengths. i have never found
them too useful except as a key to the genome. if there are multiple
genomes, we have something like

  genomes: hg19, mm9

the point is to make it prominent, particularly at a time of transition.

>   ---
>   seqinfo: 60 seqlevels (2 circulars) on 2 genomes (hg19, mm10)
>   ---
>   seqlevels seqlengths isCircular genome
>   chr1       249250621       <NA>   hg19
>   chr10      135534747       <NA>   hg19
>   ...              ...        ...    ...
>   chrX       155270560       <NA>   hg19
>   chrY        59373566       <NA>   hg19
>
> I agree that the exact content of the seqinfo table itself is rarely
> of interest so printing only 3 or 4 lines is OK. IMO it's important
> to make the user aware of the existence of this hidden table and to
> display it like what it really is (i.e. a table). Also displaying the
> column names is a well established tradition and serves the purpose
> of providing a quick summary of the accessors that are available to
> access those fields.
>
> H.
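For illustration only, a one-line header of the form "seqinfo: 60 seqlevels (2 circulars) on 2 genomes (hg19, mm10)" could be assembled roughly as follows. This is a minimal base-R sketch, not the actual show method: seqinfo_summary_line() is a hypothetical helper, it works on plain vectors shaped like the results of genome(x) and isCircular(x) rather than on a real Seqinfo object, and the "lambda" genome label in the usage lines is made up.

```r
# Hypothetical helper, not part of GenomeInfoDb or GenomicRanges.
# `genome`: named character vector, one genome label per seqlevel
#           (NA where the genome is unknown).
# `is_circular`: logical vector of the same length (NA where unknown).
seqinfo_summary_line <- function(genome, is_circular) {
  stopifnot(length(genome) == length(is_circular))

  # Naive pluralisation: append "s" unless the count is exactly 1.
  plural <- function(n, word) if (n == 1L) word else paste0(word, "s")

  n_levels <- length(genome)
  n_circ   <- sum(is_circular, na.rm = TRUE)   # NA counts as "not known circular"
  genomes  <- unique(genome[!is.na(genome)])   # distinct non-NA genome labels

  genome_part <- if (length(genomes) == 0L) {
    "genome unspecified"
  } else {
    sprintf("on %d %s (%s)",
            length(genomes), plural(length(genomes), "genome"),
            paste(genomes, collapse = ", "))
  }

  sprintf("seqinfo: %d %s (%d %s) %s",
          n_levels, plural(n_levels, "seqlevel"),
          n_circ, plural(n_circ, "circular"),
          genome_part)
}

# Toy input: three human seqlevels plus a spike-in.
g    <- c(chr1 = "hg19", chr2 = "hg19", chrM = "hg19", lambda = "lambda")
circ <- c(FALSE, FALSE, TRUE, NA)
cat(seqinfo_summary_line(g, circ), "\n")
```

Listing the distinct genomes in the header keeps a second genome from being hidden behind the "..." rows of a truncated table.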
>
>> On Mon, Sep 8, 2014 at 5:21 PM, Peter Hickey <hic...@wehi.edu.au> wrote:
>>
>> Perhaps it might be useful to have some way of highlighting if any
>> of the chromosomes are circular or highlighting if there are
>> multiple genomes present? Otherwise this information might be hidden
>> in the "…"
>>
>> Cheers,
>> Pete
>>
>> On 09/09/2014, at 9:44 AM, Hervé Pagès <hpa...@fhcrc.org> wrote:
>>
>> > On 09/08/2014 02:28 PM, Peter Hickey wrote:
>> >> Just a vote for still allowing for multiple genomes in a Seqinfo
>> >> object (in a GRanges object). My use case is in bisulfite-sequencing
>> >> experiments where there is often a spike-in of a lambda phage genome
>> >> along with the genome of interest (human or mouse). It's often
>> >> useful to keep all data from a single library together in the same
>> >> object but process according to genome(x) for each seqlevel.
>> >
>> > Note taken. Thanks Pete! It's always great to know about concrete
>> > use cases.
>> >
>> >> FWIW, I like Vincent's proposal of selectSome(unique(genome(x)))
>> >> in the show method.
>> >
>> > Or what about displaying the genome next to the seqlevel it's
>> > associated with? Like e.g.:
>> >
>> >   > gr
>> >   GRanges with 3 ranges and 0 metadata columns:
>> >         seqnames               ranges strand
>> >            <Rle>            <IRanges>  <Rle>
>> >     [1]    chr14 [19069583, 19069654]      +
>> >     [2]    chr14 [19363738, 19363809]      +
>> >     [3]    chr14 [19363755, 19363826]      -
>> >     [4]    chr14 [19369799, 19369870]      +
>> >   ---
>> >   seqinfo:
>> >          seqlevels seqlengths isCircular genome
>> >   chr1              249250621       <NA>   hg19
>> >   chr10             135534747       <NA>   hg19
>> >   chr11             135006516       <NA>   hg19
>> >   ...                     ...        ...    ...
>> >   chrUn_gl000249        38502       <NA>   hg19
>> >   chrX              155270560       <NA>   hg19
>> >   chrY               59373566       <NA>   hg19
>> >
>> > That way, we also raise awareness about the isCircular field.
>> > The current choice to only display the seqlengths pre-dates the
>> > existence of the seqinfo slot but might be a little bit misleading
>> > these days since it only exposes some arbitrary seqinfo fields.
>> >
>> > H.
>> >
>> >> Cheers,
>> >> Pete
>> >>
>> >>> I might have requested the genome annotation, but I'm pretty
>> >>> sure it wasn't me who decided on tracking it on a per-sequence
>> >>> basis. I could imagine use cases for that though, e.g., when
>> >>> diagnosing sequencing contamination (like human vs. mouse). But
>> >>> most other tools and file formats expect a single genome per
>> >>> "track", so, for example, rtracklayer has an internal function
>> >>> singleGenome() to take care of this.
>> >>>
>> >>> On Mon, Sep 8, 2014 at 10:50 AM, Hervé Pagès <hpa...@fhcrc.org> wrote:
>> >>>
>> >>>> Hi Vince,
>> >>>>
>> >>>> Yes it would make sense to have the "show" method report the
>> >>>> genome when genome(x) contains a unique non-NA value. I think
>> >>>> the main use case for having the genome defined at the sequence
>> >>>> level instead of the whole object level is metagenomics. Maybe
>> >>>> Michael has some other good use cases to share since IIRC he
>> >>>> requested the addition of the genome field a couple of years
>> >>>> ago and made the case for having it defined at the sequence
>> >>>> level.
>> >>>>
>> >>>> Cheers,
>> >>>> H.
>> >>>>
>> >>>> On 09/08/2014 07:21 AM, Vincent Carey wrote:
>> >>>>
>> >>>>> For GRanges x, my naive expectation is that genome(x) returns
>> >>>>> a length-one tag identifying the genome to which chromosomal
>> >>>>> coordinates correspond. The genome() method seems to have
>> >>>>> sequence-specific semantics, which makes sense, but when we
>> >>>>> identify sequence with chromosome, it seems too complicated.
>> >>>>> Is there a use case for a GRanges with sequences from several
>> >>>>> different genomes?
>> >>>>>
>> >>>>> One reason I am inquiring is that I feel it would be nice to
>> >>>>> have the GRanges show() method report, prominently, the genome
>> >>>>> in use (or NA if unspecified). This could be accomplished by
>> >>>>> reporting unique(genome(x)), and perhaps that would be
>> >>>>> satisfactory.
>> >>>>>
>> >>>>> after example(genome) :
>> >>>>>
>> >>>>> > seqinfo(txdb)
>> >>>>> Seqinfo of length 15
>> >>>>> seqnames seqlengths isCircular genome
>> >>>>> CH2L       23011544      FALSE    dm3
>> >>>>> CH2R       21146708      FALSE    dm3
>> >>>>> CH3L       24543557      FALSE    dm3
>> >>>>> CH3R       27905053      FALSE    dm3
>> >>>>> CH4         1351857      FALSE    dm3
>> >>>>> ...             ...        ...    ...
>> >>>>> CH3LHet     2555491      FALSE    dm3
>> >>>>> CH3RHet     2517507      FALSE    dm3
>> >>>>> CHXHet       204112      FALSE    dm3
>> >>>>> CHYHet       347038      FALSE    dm3
>> >>>>> CHUextra   29004656      FALSE    dm3
>> >>>>>
>> >>>>> > genome(seqinfo(txdb))
>> >>>>>     CH2L     CH2R     CH3L     CH3R      CH4      CHX      CHU        M
>> >>>>>    "dm3"    "dm3"    "dm3"    "dm3"    "dm3"    "dm3"    "dm3"    "dm3"
>> >>>>>  CH2LHet  CH2RHet  CH3LHet  CH3RHet   CHXHet   CHYHet CHUextra
>> >>>>>    "dm3"    "dm3"    "dm3"    "dm3"    "dm3"    "dm3"    "dm3"
>> >>>>>
>> >>>>> [[alternative HTML version deleted]]
>> >>>>>
>> >>>>> _______________________________________________
>> >>>>> Bioc-devel@r-project.org mailing list
>> >>>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>> >>>>
>> >>>> --
>> >>>> Hervé Pagès
>> >>>>
>> >>>> Program in Computational Biology
>> >>>> Division of Public Health Sciences
>> >>>> Fred Hutchinson Cancer Research Center
>> >>>> 1100 Fairview Ave. N, M1-B514
>> >>>> P.O. Box 19024
>> >>>> Seattle, WA 98109-1024
>> >>>>
>> >>>> E-mail: hpa...@fhcrc.org
>> >>>> Phone: (206) 667-5791
>> >>>> Fax: (206) 667-1319
>> >>
>> >> --------------------------------
>> >> Peter Hickey,
>> >> PhD Student/Research Assistant,
>> >> Bioinformatics Division,
>> >> Walter and Eliza Hall Institute of Medical Research,
>> >> 1G Royal Parade, Parkville, Vic 3052, Australia.
>> >> Ph: +613 9345 2324
>> >>
>> >> hic...@wehi.edu.au
>> >> http://www.wehi.edu.au
>> >>
>> >> ______________________________________________________________________
>> >> The information in this email is confidential and intend...{{dropped:6}}
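Two needs come up in this thread: processing seqlevels separately per genome (the spike-in use case), and tools that expect a single genome per "track". A rough base-R sketch of both follows, under stated assumptions: the named vector stands in for the result of genome(x), the "lambda_phage" label is made up, and single_genome() is a hypothetical check written for this example. It is not rtracklayer's internal singleGenome(), whose behaviour is not shown in the thread.

```r
# Toy stand-in for genome(x): one genome label per seqlevel.
genome <- c(chr1 = "hg19", chr2 = "hg19", chrM = "hg19",
            lambda = "lambda_phage")

# Group seqlevel names by genome so each group can be handled on its own.
seqlevels_by_genome <- split(names(genome), genome)

# Hypothetical check for code that expects one genome per object:
# returns the single genome label, NA if none is set, and stops
# with an error if more than one genome is present.
single_genome <- function(genome) {
  g <- unique(genome[!is.na(genome)])
  if (length(g) > 1L)
    stop("multiple genomes present: ", paste(g, collapse = ", "))
  if (length(g) == 0L) NA_character_ else g
}

# The human-only subset passes the check; the full vector would not.
human_only <- genome[seqlevels_by_genome[["hg19"]]]
single_genome(human_only)
```

Splitting first and checking afterwards keeps all data from one library in a single object while still letting single-genome code paths fail loudly on mixed input.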