Hello Kyle,

This previously answered mailing list question covers a bit more than you are asking (some of which you clearly already know), but it should cover your questions:
https://lists.soe.ucsc.edu/pipermail/genome/2009-September/020019.html

For a more concise introduction to locType, see this previous reply:
https://lists.soe.ucsc.edu/pipermail/genome/2006-August/011559.html

Note that the second answer unfortunately doesn't say much about the range* locTypes, which are often red flags for dbSNP's mapping and are now reported as exceptions.

Best regards,

Pauline Fujita
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu

On 4/18/12 9:30 AM, Chang, Kyle wrote:
> Thanks Vanessa. That was really helpful.
>
> How do I tell if something is an insertion or a deletion? It's obvious when a
> "class" is "deletion" or "insertion". However, it's a bit ambiguous in other
> classes.
> 1) What about in-del? Are the coordinates represented as deletions or
> insertions?
> 2) What about non-indel classes, such as microsatellite, mixed and mnp? They
> have locTypes of rangeInsertion/rangeDeletion. Do I interpret those
> coordinates like deletions and insertions?
>
> Here are all possible combinations of 'class' and 'locType' in a dbSNP file.
> class           locType
> deletion        exact
> deletion        range
> in-del          between
> in-del          exact
> in-del          range
> in-del          rangeDeletion
> in-del          rangeInsertion
> in-del          rangeSubstitution
> insertion       between
> insertion       exact
> insertion       range
> microsatellite  between
> microsatellite  exact
> microsatellite  range
> microsatellite  rangeInsertion
> mixed           between
> mixed           exact
> mixed           range
> mixed           rangeDeletion
> mnp             between
> mnp             exact
> mnp             range
> mnp             rangeDeletion
> mnp             rangeInsertion
> mnp             rangeSubstitution
> named           between
> named           exact
> named           range
> named           rangeDeletion
> named           rangeInsertion
> single          between
> single          exact
> single          range
> single          rangeDeletion
> single          rangeInsertion
> single          rangeSubstitution
>
> Kyle
>
> On 4/17/12 6:29 PM, "Vanessa Kirkup Swing"<[email protected]> wrote:
>
> Hi Kyle,
>
> Please see the answers below:
>
>
> 1) if start position is actually the 1st deleted base or the base before the
> start of the deletion.
>
> With deletions, chromStart is the first deleted base, which is unlike the
> VCF format.
>
> 2) In the case of insertion, is the start position 1bp before the actual
> insertion event?
>
> For insertions, note that when you add 1 to chromStart, chromStart becomes 1
> greater than chromEnd. So the adjusted (1-based) chromStart is the first base
> after the insertion and chromEnd is the last base before the insertion. This
> is kind of confusing, but in the end, the coordinate math for the length in
> reference bases (chromEnd - chromStart) is 0.
>
> If you have further questions, please contact the list: [email protected].
>
> Vanessa Kirkup Swing
> UCSC Genome Bioinformatics Group
>
>
> ---------- Forwarded message ----------
> From: Chang, Kyle<[email protected]>
> Date: Tue, Apr 17, 2012 at 1:55 PM
> Subject: Re: [Genome] How are ucsc dbsnp indel positions reported
> To: Vanessa Kirkup Swing<[email protected]>
> Cc: "[email protected]"<[email protected]>, "Kakkar,
> Nipun"<[email protected]>
>
>
> Right, I understand that the positions are all 0-based, and I need to add 1 to
> get the same start coordinate as the genome browser.
>
> What I want to know is 1) if start position is actually the 1st deleted base
> or the base before the start of the deletion.
> 2) In the case of insertion, is the start position 1bp before the actual
> insertion event?
>
>
> Kyle
>
>
>
> On 4/17/12 3:48 PM, "Vanessa Kirkup Swing"<[email protected]> wrote:
>
> Hi Kyle,
>
> Please see this FAQ on our coordinate system:
>
> http://genome.ucsc.edu/FAQ/FAQtracks.html#tracks1
>
>
> I hope that clarifies things for you. If you have further questions, please
> email the list: [email protected].
>
> Vanessa Kirkup Swing
> UCSC Genome Bioinformatics Group
>
>
>
>
> ---------- Forwarded message ----------
> From: Chang, Kyle<[email protected]>
> Date: Tue, Apr 17, 2012 at 10:02 AM
> Subject: [Genome] How are ucsc dbsnp indel positions reported
> To: "[email protected]"<[email protected]>
> Cc: "Kakkar, Nipun"<[email protected]>
>
>
> Hi,
>
> I have a question on how indel coordinates are reported in dbsnp tables.
>
> Here is a deletion record in dbsnp 135. Is 'start' always the 1st deleted
> base, or is it like the VCF format, in which the start is always 1bp before
> the 1st deleted base?
>
> E.g.
> 131 chr1 61341695 61341699 rs146746778 0 + TTTA TTTA -/TTTA genomic deletion
>
> For insertion, is 'start' reported as 1bp before insertion?
> So in this case, I imagine there's a CA insertion between 92536832 and
> 92536833 in 0-based coordinates.
>
> E.g.
> 161 chr1 92536832 92536832 rs72159935 0 + - - -/CA genomic insertion
>
> Best,
> Kyle
>
>
> _______________________________________________
> Genome maillist - [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
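
The coordinate arithmetic described in this thread can be illustrated with a minimal Python sketch. It assumes only what is stated above: chromStart is 0-based, adding 1 gives the browser start, a deletion's chromStart is its first deleted base, and an insertion has chromStart equal to chromEnd. The function name and the dictionary keys are hypothetical, chosen for illustration; they are not part of any UCSC tool, and the sketch does not attempt to interpret the range* locTypes.

```python
def describe_snp_interval(chrom_start, chrom_end):
    """Interpret a snp-table interval (0-based chromStart, chromEnd as given).

    Hypothetical helper for illustration only; not a UCSC API.
    """
    # Length in reference bases, as described in the thread.
    ref_len = chrom_end - chrom_start
    if ref_len < 0:
        raise ValueError("chromEnd must not be smaller than chromStart")

    if ref_len == 0:
        # Zero-length interval: an insertion point between two reference bases.
        # After adding 1, the start is one greater than the end, so the
        # 1-based start is the base after the insertion and chromEnd is
        # the base before it.
        return {
            "kind": "insertion_point",
            "ref_len": 0,
            "base_before_1based": chrom_end,
            "base_after_1based": chrom_start + 1,
        }

    # Non-empty interval (for example a deletion): chromStart + 1 is the
    # first affected base and chromEnd is the last, both 1-based.
    return {
        "kind": "span",
        "ref_len": ref_len,
        "first_base_1based": chrom_start + 1,
        "last_base_1based": chrom_end,
    }


# The two records quoted in the thread.
deletion = describe_snp_interval(61341695, 61341699)    # rs146746778, -/TTTA
insertion = describe_snp_interval(92536832, 92536832)   # rs72159935, -/CA
print(deletion)
print(insertion)
```

For the deletion record this gives a 4-base span, matching the length of TTTA; for the insertion record it gives a zero-length point whose flanking bases differ by one.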
