Hi J,

You're right: rs12345 is a regular SNP and not an insertion.  In the 
snp135 table, it is listed as:

chr22    25855458     25855459

which should be interpreted as the half-open interval [25855458-25855459), 
with a 0-based start.

In the display, on the other hand, the location is represented in our 
"position" format:

chr22:25855459-25855459

which should be interpreted as the fully closed interval 
[25855459-25855459], with a 1-based start.
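
As a rough sketch of the arithmetic (the function names below are made up 
for illustration and are not part of any UCSC tool), converting between 
the table coordinates and the "position" format could look like this in 
Python:

```python
def table_to_position(chrom, start, end):
    """Convert a 0-based, half-open table interval [start, end)
    to a 1-based, fully closed position string chrom:start-end."""
    # Only the start shifts by one; the end value is numerically unchanged.
    return f"{chrom}:{start + 1}-{end}"


def position_to_table(position):
    """Convert a 1-based, fully closed position string back to a
    0-based, half-open (chrom, start, end) tuple."""
    chrom, span = position.split(":")
    start, end = (int(x) for x in span.split("-"))
    return chrom, start - 1, end


def table_length(start, end):
    """Length in bases of a 0-based, half-open interval."""
    return end - start


# rs12345 as listed in the snp135 table
print(table_to_position("chr22", 25855458, 25855459))
print(position_to_table("chr22:25855459-25855459"))
print(table_length(25855458, 25855459))
```

With the table values for rs12345, this gives the position 
chr22:25855459-25855459 and a length of 1, which is what you would expect 
for a single-base SNP.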

The coordinate transforms page that Hiram pointed out is really helpful:
http://genomewiki.ucsc.edu/index.php/Coordinate_Transforms

Sorry for the confusion!

--
Brooke Rhead
UCSC Genome Bioinformatics Group


On 3/23/12 1:52 PM, J Ireland wrote:
> Hey Hiram,
>
> Sorry - am I missing something?  rs12345 looks like a simple bi-allelic SNP
> (not an indel) which should have a length of 1.  Its ref allele matches
> the ref genome.  The annotation page also says it has length 1.  I don't
> think this is the perfect case you were looking for....
>
> http://genome.ucsc.edu/cgi-bin/hgc?hgsid=248751473&o=25855458&t=25855459&g=snp135Common&i=rs12345
>
> Yep, I've seen that link.
>
> Thanks again.  Sorry - I never dreamed this would turn into such a marathon
> thread!
>
> -J
>
>
> On Fri, Mar 23, 2012 at 1:34 PM, Hiram Clawson<[email protected]>  wrote:
>
>> Good Afternoon J:
>>
>> This is a perfect case to illuminate the interval notation.
>> Note the interval this item is declared to be in
>>    [25855459-25855459)
>> Note the size of this interval: end - start = 25855459 - 25855459 = 0
>>
>> The length of this item is ZERO !  It doesn't exist !
>> It cannot be found in this sequence.  It is somewhere between
>> actual bases: [25855458-25855459) and [25855459-25855460)
>> but not in this reference sequence.
>>
>> So, in one sense, you could say that the interval: [25855459-25855459)
>> is actually talking about the gap between two bases.  The genome
>> browser does not show gaps at all, there are only bases and they
>> are directly next to each other with no intervening space.  We don't
>> do gaps between bases.  When you enter such a position, we can't
>> display it, we do the next best thing and display one of the bases
>> next to this non-existent item.
>>
>> I'm sure you have already seen our discussion:
>>    http://genomewiki.ucsc.edu/index.php/Coordinate_Transforms
>>
>> -Hiram
>>
>> ----- Original Message -----
>> From: "J Ireland"<[email protected]>
>> To: "Hiram Clawson"<[email protected]>
>> Cc: [email protected]
>> Sent: Thursday, March 22, 2012 11:24:00 PM
>> Subject: Re: [Genome] base vs gap numbering
>>
>> Hey Hiram,
>>
>>
>> I'm definitely a firm believer in the interval, trust me - and I
>> definitely dig the UCSC data tables and the half-open, zero-start intervals.
>>
>>
>> So, rolling with the "everything is an interval" and looking up rs12345...
>>
>>
>> If I zoom into the browser here:
>> http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg19&position=chr22%3A25855458-25855460
>> I see rs12345 appears to be at [25855458-25855459) given the hash marks.
>> This is a zero-based start and end.
>>
>>
>> Now, I click on rs12345 and see the position as 25855459-25855459 in the
>> annotation page. Should I interpret this as [25855459-25855459), with a
>> one-based start and zero-based end or as [25855459-25855459] a fully closed
>> interval with start and end both one-based?
>>
>>
>> I know you're straddling different count-starting conventions between db
>> tables and the annotation pages. Maybe my misunderstanding is I thought the
>> browser was using the same convention as the annotation pages, but it looks
>> like the browser is actually more in line with the db tables...
>>
>>
>> Thanks for taking this trip with me down the rabbit hole of the obscure
>> and arcane ;)
>> -J
>>
> _______________________________________________
> Genome maillist  -  [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome