Related to this:

I have added getters for seqinfo (and friends) for the OrganismDb objects. I have not added the setters yet though since that requires some refactoring of what an OrganismDb object actually is internally. But I intend to do this also.

  Marc



On 04/25/2013 09:32 AM, Valerie Obenchain wrote:
Hi Vince, Kasper,

cc'ing Herve and Marc.

I think we have a couple of things going on so I wanted to clarify. The 'genome' argument to readVcf() is assigned to the GRanges in rowData with the "genome<-" setter. This is where .normargGenome() gets called.

setReplaceMethod("genome", "Seqinfo",
    function(x, value)
    {
        x@genome <- .normargGenome(value, seqnames(x))
        x
    }
)

If the 'genome' replacement value is named, the name(s) must match the seqnames, not the build. So we aren't talking about matching compatible builds,

fl <- system.file("extdata", "ex2.vcf", package="VariantAnnotation")
vcf <- readVcf(fl, c("b37"="hg19"))  ## this is wrong
vcf <- readVcf(fl, c("hg19"="hg19")) ## also wrong

Instead the name must be the seqname, the value is the build,

vcf <- readVcf(fl, c("20"="hg19"))  ## correct
vcf <- readVcf(fl, "hg19")  ## also correct

This requirement for 'genome' is not well documented on ?readVcf or ?Seqinfo. We can fix that.

The second thing is the issue of a flexible mapping between seqinfo metadata for different institutions. Herve and Marc have worked on this in AnnotationDbi. They can explain more about the 'SeqnameStyle' and how it might be used more widely.


Val


On 04/25/2013 06:54 AM, Kasper Daniel Hansen wrote:
An "official" comment on this
   http://genome.ucsc.edu/cgi-bin/hgGateway?db=hg19
with some more info in this discussion

https://groups.google.com/a/soe.ucsc.edu/forum/?fromgroups=#!topic/genome/hFp-dGG9gBs

Essentially it seems the b37 has been "patched" and this patched release is
not reflected in hg19 but may be (I don't know) reflected in the b37
download from NCBI

Kasper


On Thu, Apr 25, 2013 at 9:49 AM, Kasper Daniel Hansen <
kasperdanielhan...@gmail.com> wrote:

I agree with Vincent. I have seen code from Herve in a package with some standardization of chromosome names, and this code could perhaps be used
more widely so we don't have all the problems with chr1 vs chr01 vs 1.

However, in this particular case, if Ulrich is actually interested in the
mitochondrial genome, he has a problem.

hg19, which is the genome version from UCSC is consider equal to NCBIs
b37.  However, as far as I understand, UCSC screwed up with the
mitochondrial genome and used an old version for their hg19. So the error message is in many ways right here: the two genomes are slightly different
because they have different mitochondrial genomes.

Kasper


    [[alternative HTML version deleted]]

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

Reply via email to