Thanks Vanessa. That was really helpful.

How do I tell whether a record is an insertion or a deletion? It's obvious when the "class" is "deletion" or "insertion", but it's ambiguous for the other classes.

1) What about in-del? Are the coordinates represented like deletions or like insertions?
2) What about non-indel classes such as microsatellite, mixed and mnp? They have locTypes of rangeInsertion/rangeDeletion. Do I interpret those coordinates like deletions and insertions?
Here are all possible combinations of 'class' and 'locType' in a dbSNP file.

class           locType
deletion        exact
deletion        range
in-del          between
in-del          exact
in-del          range
in-del          rangeDeletion
in-del          rangeInsertion
in-del          rangeSubstitution
insertion       between
insertion       exact
insertion       range
microsatellite  between
microsatellite  exact
microsatellite  range
microsatellite  rangeInsertion
mixed           between
mixed           exact
mixed           range
mixed           rangeDeletion
mnp             between
mnp             exact
mnp             range
mnp             rangeDeletion
mnp             rangeInsertion
mnp             rangeSubstitution
named           between
named           exact
named           range
named           rangeDeletion
named           rangeInsertion
single          between
single          exact
single          range
single          rangeDeletion
single          rangeInsertion
single          rangeSubstitution

Kyle

On 4/17/12 6:29 PM, "Vanessa Kirkup Swing" <[email protected]> wrote:

Hi Kyle,

Please see the answers below:

1) if start position is actually the 1st deleted base or the base before the start of the deletion.

With deletions, chromStart is the first deleted base, which is unlike the VCF format.

2) In the case of insertion, is the start position 1bp before the actual insertion event.

For insertions, note that when you add 1 to chromStart, the result is 1 greater than chromEnd. So the adjusted chromStart is the first base after the insertion and chromEnd is the last base before the insertion. This is kind of confusing, but in the end the length in reference bases (chromEnd - chromStart) works out to 0.

If you have further questions, please contact the list: [email protected].

Vanessa Kirkup Swing
UCSC Genome Bioinformatics Group

---------- Forwarded message ----------
From: Chang, Kyle <[email protected]>
Date: Tue, Apr 17, 2012 at 1:55 PM
Subject: Re: [Genome] How are ucsc dbsnp indel positions reported
To: Vanessa Kirkup Swing <[email protected]>
Cc: "[email protected]" <[email protected]>, "Kakkar, Nipun" <[email protected]>

Right, I understand that the positions are all 0-based, and I need to add 1 to get the same start coordinate as the genome browser.
What I want to know is:

1) if start position is actually the 1st deleted base or the base before the start of the deletion.
2) In the case of insertion, is the start position 1bp before the actual insertion event.

Kyle

On 4/17/12 3:48 PM, "Vanessa Kirkup Swing" <[email protected]> wrote:

Hi Kyle,

Please see this FAQ on our coordinate system:

http://genome.ucsc.edu/FAQ/FAQtracks.html#tracks1

I hope that clarifies things for you. If you have further questions, please email the list: [email protected].

Vanessa Kirkup Swing
UCSC Genome Bioinformatics Group

---------- Forwarded message ----------
From: Chang, Kyle <[email protected]>
Date: Tue, Apr 17, 2012 at 10:02 AM
Subject: [Genome] How are ucsc dbsnp indel positions reported
To: "[email protected]" <[email protected]>
Cc: "Kakkar, Nipun" <[email protected]>

Hi,

I have a question on how indel coordinates are reported in dbSNP tables. Here is a deletion record in dbSNP 135. Is 'start' always the 1st deleted base, or is it like the VCF format, in which the start is always 1bp before the 1st deleted base?

E.g.
131 chr1 61341695 61341699 rs146746778 0 + TTTA TTTA -/TTTA genomic deletion

For insertions, is 'start' reported as 1bp before the insertion? So in this case, I imagine there's a CA insertion between 92536832 and 92536833 in 0-based coordinates.

E.g.
161 chr1 92536832 92536832 rs72159935 0 + - - -/CA genomic insertion

Best,
Kyle

_______________________________________________
Genome maillist - [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
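
As a rough illustration of the coordinate conventions described in Vanessa's reply, here is a minimal Python sketch. The function name `interpret_ucsc_span` and the returned field names are made up for this example and are not part of any UCSC tool. It looks only at the span length, so it covers the deletion and insertion records quoted in this thread. How classes such as in-del, mixed or mnp should be read is not settled here.

```python
def interpret_ucsc_span(chrom_start, chrom_end):
    """Interpret a UCSC (chromStart, chromEnd) pair as 1-based positions.

    Assumes the usual UCSC table convention: chromStart is 0-based,
    chromEnd is exclusive. Field names below are hypothetical.
    """
    ref_length = chrom_end - chrom_start
    if ref_length == 0:
        # Zero-length span, as in the insertion example: the event sits
        # between 1-based base chromEnd and 1-based base chromStart + 1.
        return {
            "kind": "zero_length",
            "ref_length": 0,
            "base_before": chrom_end,
            "base_after": chrom_start + 1,
        }
    # Non-empty span, as in the deletion example: chromStart + 1 is the
    # first affected base and chromEnd is the last one (both 1-based).
    return {
        "kind": "reference_span",
        "ref_length": ref_length,
        "first_base": chrom_start + 1,
        "last_base": chrom_end,
    }


# The two records quoted in this thread.
deletion = interpret_ucsc_span(61341695, 61341699)   # rs146746778
insertion = interpret_ucsc_span(92536832, 92536832)  # rs72159935
print(deletion)
print(insertion)
```

For the deletion record this gives 1-based bases 61341696 through 61341699 (4 reference bases, matching TTTA), and for the insertion record a zero-length event between 1-based bases 92536832 and 92536833.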
