I would say rows 1 and 3 are SNVs, but not row 4. For this application I think a variant has to be an SNV or not, as you can't pass half a variant. (I suppose you could remove the ALT values with length > 1 and set those genotypes to missing, but that is both complicated and unexpected behavior. Also, it could introduce bias into association testing, since you would get non-random missingness.)

isSNV() in SeqVarTools returns TRUE only if the max length of all alleles is 1. It also has a logical argument "biallelic" that restricts the selection to biallelic SNVs, which could be useful here as well. If biallelic=TRUE, only row 1 would make it into the subset.
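
To make the two rules concrete, here is a minimal base-R sketch applied to the four example rows. It is not the SeqVarTools implementation: the helper name is_snv_strict() and the plain character-vector representation of the alleles are assumptions for illustration only.

```r
# Illustration only: alleles as plain character vectors rather than
# DNAStringSet / DNAStringSetList objects.
ref <- c("G", "AA", "T", "G")
alt <- list("A", "TT", c("G", "A"), c("TT", "C"))

# Hypothetical helper: a row counts as an SNV only if REF and every ALT
# allele have length 1. With biallelic = TRUE the row must also have
# exactly one ALT allele.
is_snv_strict <- function(ref, alt, biallelic = FALSE) {
  max_len <- mapply(function(r, a) max(nchar(c(r, a))), ref, alt)
  snv <- unname(max_len == 1)
  if (biallelic) snv <- snv & lengths(alt) == 1
  snv
}

is_snv_strict(ref, alt)                    # TRUE for rows 1 and 3
is_snv_strict(ref, alt, biallelic = TRUE)  # TRUE for row 1 only
```

Row 4 fails because one of its ALT alleles ("TT") is longer than 1, so the whole row is dropped rather than split.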

Stephanie

On 3/18/14 4:04 PM, Julian Gehring wrote:
Hi Valerie,

I would consider G>C an SNV, G>TT not.  But I assume that there exists
no clear consensus on this.  How about a flag that lets the second pass
as SNV optionally, so everybody can get what one needs?
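
A rough sketch of what such a flag could look like, in base R and with
plain character vectors standing in for the Biostrings objects.  The
function name is_snv() and the argument name "partial" are made up for
illustration and are not part of VariantAnnotation:

```r
ref <- c("G", "AA", "T", "G")
alt <- list("A", "TT", c("G", "A"), c("TT", "C"))

# Hypothetical helper; 'partial' is an invented argument name.
# partial = FALSE: every ALT allele must be a single nucleotide.
# partial = TRUE:  one single-nucleotide ALT allele is enough.
is_snv <- function(ref, alt, partial = FALSE) {
  combine <- if (partial) any else all
  alt_ok <- vapply(alt, function(a) combine(nchar(a) == 1), logical(1))
  nchar(ref) == 1 & alt_ok
}

is_snv(ref, alt)                  # strict: rows 1 and 3
is_snv(ref, alt, partial = TRUE)  # lenient: rows 1, 3 and 4
```

With the lenient setting, row 4 (G>TT,C) passes because of the C
allele, while the default keeps the strict behavior.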

Best wishes
Julian


On 18/03/14 18:36, Valerie Obenchain wrote:
Hi,

I've added a restrictToSNV() function to VariantAnnotation (1.9.46). The
return value is a subset VCF object containing SNVs only. The function
operates on CollapsedVCF or ExpandedVCF and the alt(VCF) value must be
nucleotides (i.e., no structural variants).

A variant is considered a SNV if the nucleotide sequences in both
ref(vcf) and alt(vcf) are of length 1. I have a question about how
variants with multiple 'ALT' values should be handled.

Should we consider row 4 a SNV? One 'ALT' is length 1, the other is not.

library(Biostrings)  # also attaches S4Vectors, which provides DataFrame()
ALT <- DNAStringSetList("A", "TT", c("G", "A"), c("TT", "C"))
REF <- DNAStringSet(c("G", "AA", "T", "G"))
DataFrame(REF, ALT)
DataFrame with 4 rows and 2 columns
              REF                ALT
   <DNAStringSet> <DNAStringSetList>
1              G                  A
2             AA                 TT
3              T                G,A
4              G               TT,C


Thanks.
Valerie

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel
