Awesome! Thanks, Brooke. That's perfect. Last question to put this to rest (this was part of my original question): when I zoom way in on the browser (say down to 5 or 10 bases), am I correct that this is showing me your interval (not position) format? That's how it appears to me.

Thanks again,
-J

On Fri, Mar 23, 2012 at 4:55 PM, Brooke Rhead <[email protected]> wrote:

> Hi J,
>
> You're right: rs12345 is a regular SNP and not an insertion. In the
> snp135 table, it is listed as:
>
> chr22 25855458 25855459
>
> which should be interpreted as [25855458-25855459), 0-based.
>
> In the display, on the other hand, the location is represented in our
> "position" format:
>
> chr22:25855459-25855459
>
> which should be interpreted as [25855459-25855459], 1-based.
>
> The coordinate transforms page that Hiram pointed out is really helpful:
> http://genomewiki.ucsc.edu/index.php/Coordinate_Transforms
>
> Sorry for the confusion!
>
> --
> Brooke Rhead
> UCSC Genome Bioinformatics Group
>
> On 3/23/12 1:52 PM, J Ireland wrote:
>
>> Hey Hiram,
>>
>> Sorry - am I missing something? rs12345 looks like a simple bi-allelic
>> SNP (not an indel) which should have a length of 1. Its ref allele
>> matches the ref genome. The annotation page also says it has length 1.
>> I don't think this is the perfect case you were looking for....
>>
>> http://genome.ucsc.edu/cgi-bin/hgc?hgsid=248751473&o=25855458&t=25855459&g=snp135Common&i=rs12345
>>
>> Yep, I've seen that link.
>>
>> Thanks again. Sorry - I never dreamed this would turn into such a
>> marathon thread!
>>
>> -J
>>
>> On Fri, Mar 23, 2012 at 1:34 PM, Hiram Clawson<[email protected]> wrote:
>>
>>> Good Afternoon J:
>>>
>>> This is a perfect case to illuminate the interval notation.
>>> Note the interval this item is declared to be in
>>> [25855459-25855459)
>>> Note the size of this interval: = end - start = 25855459 - 25855459 = 0
>>>
>>> The length of this item is ZERO ! It doesn't exist !
>>> It can not be found in this sequence. It is somewhere between
>>> actual bases: [25855458-25855459) and [25855459-25855460)
>>> but not in this reference sequence.
>>>
>>> So, in one sense, you could say that the interval: [25855459-25855459)
>>> is actually talking about the gap between two bases. The genome
>>> browser does not show gaps at all, there are only bases and they
>>> are directly next to each other with no intervening space. We don't
>>> do gaps between bases. When you enter such a position, we can't
>>> display it, we do the next best thing and display one of the bases
>>> next to this non-existent item.
>>>
>>> I'm sure you have already seen our discussion:
>>>
>>> http://genomewiki.ucsc.edu/index.php/Coordinate_Transforms
>>>
>>> -Hiram
>>>
>>> ----- Original Message -----
>>> From: "J Ireland"<mr.james.ireland@gmail.com>
>>> To: "Hiram Clawson"<[email protected]>
>>> Cc: [email protected]
>>> Sent: Thursday, March 22, 2012 11:24:00 PM
>>> Subject: Re: [Genome] base vs gap numbering
>>>
>>> Hey Hiram,
>>>
>>> I'm definitely a firm believer in the interval, trust me - and I
>>> definitely dig the UCSC data tables and the half-open, zero-start
>>> intervals.
>>>
>>> So, rolling with the "everything is an interval" and looking up
>>> rs12345...
>>>
>>> If I zoom into the browser here:
>>> http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg19&position=chr22%3A25855458-25855460
>>> I see rs12345 appears to be at [25855458-25855459) given the hash marks.
>>> This is a zero-based start and end.
>>>
>>> Now, I click on rs12345 and see the position as 25855459-25855459 in the
>>> annotation page. Should I interpret this as [25855459-25855459), with a
>>> one-based start and zero-based end or as [25855459-25855459] a fully
>>> closed interval with start and end both one-based?
>>>
>>> I know you're straddling different count-starting conventions between db
>>> tables and the annotation pages. Maybe my misunderstanding is I thought
>>> the browser was using the same convention as the annotation pages, but it
>>> looks like the browser is actually more in line with the db tables...
>>>
>>> Thanks for taking this trip with me down the rabbit hole of the obscure
>>> and arcane ;)
>>> -J

_______________________________________________
Genome maillist - [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
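
For anyone following along, the two conventions discussed in this thread (0-based, half-open table coordinates versus the 1-based, fully closed "position" format) can be sketched in a few lines of Python. This is only an illustration of the arithmetic Brooke and Hiram describe, not UCSC code. The function names are made up for this example, and the sketch assumes a non-empty interval (end > start). It deliberately does not try to render zero-length items such as insertions.

```python
def table_to_position(chrom, chrom_start, chrom_end):
    """Convert a 0-based, half-open table interval [start, end) into a
    1-based, fully closed position string 'chrom:start-end'.

    Hypothetical helper for illustration; assumes end > start.
    """
    if chrom_end <= chrom_start:
        raise ValueError("this sketch only handles intervals of length >= 1")
    # Only the start shifts by one; the half-open end equals the closed end.
    return f"{chrom}:{chrom_start + 1}-{chrom_end}"


def position_to_table(position):
    """Convert a 1-based, fully closed 'chrom:start-end' string back into a
    0-based, half-open (chrom, start, end) tuple.

    Hypothetical helper for illustration.
    """
    chrom, span = position.rsplit(":", 1)
    start_text, end_text = span.split("-")
    return chrom, int(start_text) - 1, int(end_text)


def interval_length(chrom_start, chrom_end):
    """Length of a 0-based, half-open interval: end - start."""
    return chrom_end - chrom_start


# rs12345 as listed in the snp135 table, per the thread.
print(table_to_position("chr22", 25855458, 25855459))
print(position_to_table("chr22:25855459-25855459"))
print(interval_length(25855458, 25855459))
# Hiram's zero-length example: [25855459-25855459) read as half-open.
print(interval_length(25855459, 25855459))
```

With the rs12345 values from the thread, the table row `chr22 25855458 25855459` maps to `chr22:25855459-25855459` and back. The same numbers `25855459-25855459`, if misread as a half-open interval, have length zero, which is the source of the confusion above.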
