[Bioc-devel] VariantAnnotation: Same locus, multiple samples

2014-12-05 Thread Julian Gehring
Hi,

Assume that we have two variants from two samples at the same locus, stored in a 'VRanges' or 'VCF' object:

    library(VariantAnnotation)
    vr = VRanges("1", IRanges(c(10, 10), width = 1),
                 ref = c("C", "C"), alt = c("A", "G"),
                 sampleNames = c("S1", "S2"))
    vcf = as(vr, "VCF")

If we

Re: [Bioc-devel] VariantAnnotation: Same locus, multiple samples

2014-12-05 Thread Michael Lawrence
The two data structures do not encode the same information. Coercion to VCF forms a rectangular matrix: position+alt by sample. There is no standard way to encode that a given cell in that matrix is absent, so coercion to VRanges simply maps each cell to an element. One could imagine using the "."
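
The geometry described in this reply can be sketched with the two-variant example from the first message. This is an illustration only, not code from the thread: `vr2` is a name introduced here, and the sketch assumes that the `as(vcf, "VRanges")` coercion is available in the installed VariantAnnotation version. No output is shown, since the exact printout may differ between releases.

```r
library(VariantAnnotation)

# Two variants at the same locus, one per sample (from the first message).
vr <- VRanges("1", IRanges(c(10, 10), width = 1),
              ref = c("C", "C"), alt = c("A", "G"),
              sampleNames = c("S1", "S2"))

# Coercion to VCF: rows are position + alt, columns are samples.
vcf <- as(vr, "VCF")
dim(vcf)

# Coercion back to VRanges (assumed available): per the reply above,
# each cell of the rectangular matrix becomes one element.
vr2 <- as(vcf, "VRanges")

# Compare the number of elements before and after the round trip.
length(vr)
length(vr2)
```

If `length(vr2)` exceeds `length(vr)`, the extra elements correspond to sample and alt combinations that were never observed in the original 'VRanges' but that the rectangular VCF layout cannot mark as absent.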

Re: [Bioc-devel] VariantAnnotation: Same locus, multiple samples

2014-12-08 Thread Michael Lawrence
I don't see how this can be fixed. The two data structures are semantically incompatible; they encode different types of information, so information is lost in both directions. Even if we collapsed the alts, there is no way (as far as I know) to say that data for one individual + alt combination is

Re: [Bioc-devel] VariantAnnotation: Same locus, multiple samples

2014-12-08 Thread Valerie Obenchain
(Resending - the last message didn't post to the list.) I was thinking the absence of a header in VRanges would make collapsing difficult and with your comments it's clear this isn't a good idea. I like the description you gave of the differences in class content and geometry and have added t