I typically get this information from Homo.sapiens. The result depends entirely on the TxDb embedded in it, so it reflects whatever genome build that TxDb was made from. I don't know how easy it is to swap in an alternate TxDb to get a different build. I think it would make sense to regard the OrganismDb instances as foundational for this sort of structural data.
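For concreteness, here is a rough sketch of what I mean. It assumes Homo.sapiens is installed and that seqinfo() dispatches on the OrganismDb by delegating to the embedded TxDb (hg19 knownGene in the releases I have used). The object myResults is a made-up placeholder, not anything from a real analysis.

```r
# Sketch only: assumes the Homo.sapiens OrganismDb package and its
# dependencies are installed from Bioconductor.
library(Homo.sapiens)
library(GenomicRanges)

# Assumption: seqinfo() on the OrganismDb reports what the embedded TxDb
# knows (seqnames, seqlengths, circularity, genome build).
si <- seqinfo(Homo.sapiens)

# Hypothetical result object with no seqlengths or genome set.
myResults <- GRanges("chr22", IRanges(start = 1e6, width = 100))

# Fill in the missing structural data for the seqlevels actually in use.
seqinfo(myResults) <- si[seqlevels(myResults)]
seqinfo(myResults)
```

The build you get this way is whatever the embedded TxDb was made from, which is the dependence I mentioned above.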

On Wed, Jun 3, 2015 at 3:12 PM, Kasper Daniel Hansen <kasperdanielhan...@gmail.com> wrote:

> Let me rephrase this slightly. From one POV the purpose of GenomeInfoDb is
> to clean up the seqinfo slot. Currently it does most of the cleaning, but it
> does not add seqlengths.
>
> It is clear that seqlengths depends on the version of the genome, but I
> will argue so does the seqnames. Of course, for human, chr22 will not
> change. But what about the names of all the random contigs? Or for other
> organisms, what about going from a draft genome with 10k contigs to a more
> complete genome assembled into fewer, larger chromosomes?
>
> I acknowledge that this information is present in the BSgenome packages,
> but it seems (to me) to be very appropriate to have them around for
> cleaning up the seqinfo slot. For some situations it is not great to
> depend on a 1 GB download for something that is a few bytes.
>
> Best,
> Kasper
>
> On Wed, Jun 3, 2015 at 3:00 PM, Tim Triche, Jr. <tim.tri...@gmail.com>
> wrote:
>
> > It would be nice (for a number of reasons) to have chromosome lengths
> > readily available in a foundational package like GenomeInfoDb, so that,
> > say,
> >
> > data(seqinfo.hg19)
> > seqinfo(myResults) <- seqinfo.hg19[ seqlevels(myResults) ]
> >
> > would work without issues. Is there any particular reason this couldn't
> > happen for the supported/available BSgenomes? It would seem like a simple
> > matter to do
> >
> > R> library(BSgenome.Hsapiens.UCSC.hg19)
> > R> seqinfo.hg19 <- seqinfo(Hsapiens)
> > R> save(seqinfo.hg19,
> >         file="~/bioc-devel/GenomeInfoDb/data/seqinfo.hg19.rda")
> >
> > and be done with it until (say) the next release or next released
> > BSgenome. I considered looping through the following BSgenomes myself...
> > and if it isn't strongly opposed by (everyone) I may still do exactly
> > that. Seems useful, no?
> >
> > e.g. for the following 42 builds,
> >
> > grep("(UCSC|NCBI)", unique(gsub(".masked", "", available.genomes())),
> >      value=TRUE)
> >  [1] "BSgenome.Amellifera.UCSC.apiMel2"   "BSgenome.Btaurus.UCSC.bosTau3"
> >  [3] "BSgenome.Btaurus.UCSC.bosTau4"      "BSgenome.Btaurus.UCSC.bosTau6"
> >  [5] "BSgenome.Btaurus.UCSC.bosTau8"      "BSgenome.Celegans.UCSC.ce10"
> >  [7] "BSgenome.Celegans.UCSC.ce2"         "BSgenome.Celegans.UCSC.ce6"
> >  [9] "BSgenome.Cfamiliaris.UCSC.canFam2"  "BSgenome.Cfamiliaris.UCSC.canFam3"
> > [11] "BSgenome.Dmelanogaster.UCSC.dm2"    "BSgenome.Dmelanogaster.UCSC.dm3"
> > [13] "BSgenome.Dmelanogaster.UCSC.dm6"    "BSgenome.Drerio.UCSC.danRer5"
> > [15] "BSgenome.Drerio.UCSC.danRer6"       "BSgenome.Drerio.UCSC.danRer7"
> > [17] "BSgenome.Ecoli.NCBI.20080805"       "BSgenome.Gaculeatus.UCSC.gasAcu1"
> > [19] "BSgenome.Ggallus.UCSC.galGal3"      "BSgenome.Ggallus.UCSC.galGal4"
> > [21] "BSgenome.Hsapiens.NCBI.GRCh38"      "BSgenome.Hsapiens.UCSC.hg17"
> > [23] "BSgenome.Hsapiens.UCSC.hg18"        "BSgenome.Hsapiens.UCSC.hg19"
> > [25] "BSgenome.Hsapiens.UCSC.hg38"        "BSgenome.Mfascicularis.NCBI.5.0"
> > [27] "BSgenome.Mfuro.UCSC.musFur1"        "BSgenome.Mmulatta.UCSC.rheMac2"
> > [29] "BSgenome.Mmulatta.UCSC.rheMac3"     "BSgenome.Mmusculus.UCSC.mm10"
> > [31] "BSgenome.Mmusculus.UCSC.mm8"        "BSgenome.Mmusculus.UCSC.mm9"
> > [33] "BSgenome.Ptroglodytes.UCSC.panTro2" "BSgenome.Ptroglodytes.UCSC.panTro3"
> > [35] "BSgenome.Rnorvegicus.UCSC.rn4"      "BSgenome.Rnorvegicus.UCSC.rn5"
> > [37] "BSgenome.Rnorvegicus.UCSC.rn6"      "BSgenome.Scerevisiae.UCSC.sacCer1"
> > [39] "BSgenome.Scerevisiae.UCSC.sacCer2"  "BSgenome.Scerevisiae.UCSC.sacCer3"
> > [41] "BSgenome.Sscrofa.UCSC.susScr3"      "BSgenome.Tguttata.UCSC.taeGut1"
> >
> > Am I insane for suggesting this? It would make things a little easier for
> > rtracklayer, most SummarizedExperiment and SE-derived objects, blah, blah,
> > blah...
> >
> > Best,
> >
> > --t
> >
> > Statistics is the grammar of science.
> > Karl Pearson <http://en.wikipedia.org/wiki/The_Grammar_of_Science>

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel