Hi J,

You're right: rs12345 is a regular SNP and not an insertion. In the snp135 table, it is listed as:

chr22 25855458 25855459

which should be interpreted as [25855458-25855459), 0-based. In the display, on the other hand, the location is represented in our "position" format:

chr22:25855459-25855459

which should be interpreted as [25855459-25855459], 1-based.

The coordinate transforms page that Hiram pointed out is really helpful:
http://genomewiki.ucsc.edu/index.php/Coordinate_Transforms

Sorry for the confusion!

--
Brooke Rhead
UCSC Genome Bioinformatics Group

On 3/23/12 1:52 PM, J Ireland wrote:
> Hey Hiram,
>
> Sorry - am I missing something? rs12345 looks like a simple bi-allelic SNP
> (not an indel) which should have a length of 1. It's ref allele matches
> the ref genome. The annotation page also says it has length 1. I don't
> think this is the perfect case you were looking for....
>
> http://genome.ucsc.edu/cgi-bin/hgc?hgsid=248751473&o=25855458&t=25855459&g=snp135Common&i=rs12345
>
> Yep, I've seen that link.
>
> Thanks again. Sorry - I never dreamed this would turn into such a marathon
> thread!
>
> -J
>
>
> On Fri, Mar 23, 2012 at 1:34 PM, Hiram Clawson<[email protected]> wrote:
>
>> Good Afternoon J:
>>
>> This is a perfect case to illuminate the interval notation.
>> Note the interval this item is declared to be in
>> [25855459-25855459)
>> Note the size of this interval: = end - start = 25855459 - 25855459 = 0
>>
>> The length of this item is ZERO ! It doesn't exist !
>> It can not be found in this sequence. It is somewhere between
>> actual bases: [25855458-25855459) and [25855459-25855460)
>> but not in this reference sequence.
>>
>> So, in one sense, you could say that the interval: [25855459-25855459)
>> is actually talking about the gap between two bases. The genome
>> browser does not show gaps at all, there are only bases and they
>> are directly next to each other with no intervening space. We don't
>> do gaps between bases.
>> When you enter such a position, we can't
>> display it, we do the next best thing and display one of the bases
>> next to this non-existent item.
>>
>> I'm sure you have already seen our discussion:
>> http://genomewiki.ucsc.edu/index.php/Coordinate_Transforms
>>
>> -Hiram
>>
>> ----- Original Message -----
>> From: "J Ireland"<[email protected]>
>> To: "Hiram Clawson"<[email protected]>
>> Cc: [email protected]
>> Sent: Thursday, March 22, 2012 11:24:00 PM
>> Subject: Re: [Genome] base vs gap numbering
>>
>> Hey Hiram,
>>
>> I'm definitely a firm believer in the interval, trust me - and I
>> definitely dig the UCSC data tables and the half-open, zero-start intervals.
>>
>> So, rolling with the "everything is an interval" and looking up rs12345...
>>
>> If I zoom into the browser here:
>> http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg19&position=chr22%3A25855458-25855460
>> I see rs12345 appears to be at [25855458-25855459) given the hash marks.
>> This is a zero-based start and end.
>>
>> Now, I click on rs12345 and see the position as 25855459-25855459 in the
>> annotation page. Should I interpret this as [25855459-25855459), with a
>> one-based start and zero-based end or as [25855459-25855459] a fully closed
>> interval with start and end both one-based?
>>
>> I know you're straddling different count-starting conventions between db
>> tables and the annotation pages. Maybe my misunderstanding is I thought the
>> browser was using the same convention as the annotation pages, but it looks
>> like the browser is actually more inline with the db tables...
>>
>> Thanks for taking this trip with me down the rabbit hole of the obscure
>> and arcane ;)
>> -J
>>
> _______________________________________________
> Genome maillist - [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
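
The two conventions discussed in this thread can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (table_to_position, position_to_table, half_open_length, closed_length) are hypothetical and not part of any UCSC tool, and the sketch assumes the table stores a 0-based, half-open [start, end) interval while the "position" string is 1-based and fully closed, as described above.

```python
def table_to_position(chrom, start, end):
    """Convert a 0-based half-open [start, end) table interval
    to a 1-based fully closed 'chrom:first-last' position string."""
    # Only the start shifts by one; the end value is numerically unchanged.
    return f"{chrom}:{start + 1}-{end}"


def position_to_table(position):
    """Convert a 1-based closed 'chrom:first-last' position string
    back to a 0-based half-open (chrom, start, end) tuple."""
    chrom, _, span = position.partition(":")
    first, last = (int(x.replace(",", "")) for x in span.split("-"))
    return chrom, first - 1, last


def half_open_length(start, end):
    """Length of a 0-based half-open interval [start, end)."""
    return end - start


def closed_length(first, last):
    """Length of a 1-based fully closed interval [first, last]."""
    return last - first + 1


# rs12345 as listed in the snp135 table: chr22 25855458 25855459
print(table_to_position("chr22", 25855458, 25855459))  # chr22:25855459-25855459
print(half_open_length(25855458, 25855459))            # 1
print(closed_length(25855459, 25855459))               # 1
# The same numbers read as half-open, [25855459-25855459), have length 0
print(half_open_length(25855459, 25855459))            # 0
```

The last two calls show the point of the thread: the pair 25855459-25855459 describes one base when read as a closed, 1-based interval, and zero bases when read as a half-open interval.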
