Hi J,

The 0-based, half-open convention is what we use for numbering in tables. You see it when you deal with tables directly. On the other hand, the position format, and pretty much everything you see in the graphical view on the main page (http://genome.ucsc.edu/cgi-bin/hgTracks) or when you click on an item and go to its details page, is 1-based, fully-closed.

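To make the two conventions concrete, here is a minimal Python sketch of the conversion between them. It is not UCSC code: the function names table_to_position and position_to_table are made up for illustration. It relies only on the rule that the start shifts by one while the end value stays the same, and uses the rs12345 coordinates quoted further down this thread.

```python
def table_to_position(chrom, chrom_start, chrom_end):
    """Convert a 0-based, half-open table row to a 1-based, fully-closed position string."""
    # Only the start shifts by one; the end is numerically unchanged.
    return f"{chrom}:{chrom_start + 1}-{chrom_end}"


def position_to_table(position):
    """Convert a 1-based, fully-closed position string to (chrom, start, end), 0-based half-open."""
    chrom, span = position.rsplit(":", 1)
    start, end = span.replace(",", "").split("-")
    return chrom, int(start) - 1, int(end)


# rs12345 as listed in the snp135 table: chr22 25855458 25855459
print(table_to_position("chr22", 25855458, 25855459))  # chr22:25855459-25855459
print(position_to_table("chr22:25855459-25855459"))    # ('chr22', 25855458, 25855459)
```

The same rule means chr1:1-10 in position format corresponds to start 0, end 10 in a table.
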
It might be helpful to go to the very beginning of a chromosome, say, chr1:1-10, and see that the first base on the chromosome is labeled "1". As you've noticed, the number labels are not centered over the base labels (the A, T, C, G, or N) at that zoom level: the numbers are next to the tick mark that marks the edge of the base. This convention makes more sense when you zoom out a bit and see only every 5th or 10th base labeled.

--
Brooke Rhead
UCSC Genome Bioinformatics Group

On 3/23/12 5:09 PM, J Ireland wrote:
> Awesome! Thanks, Brooke. That's perfect.
>
> Last question to put this to rest (this was part of my original
> question) - when I zoom way in on the browser (say down to 5 or 10
> bases), am I correct that this is showing me your interval (not
> position) format? That's how it appears to me.
>
> Thanks again,
> -J
>
> On Fri, Mar 23, 2012 at 4:55 PM, Brooke Rhead <[email protected]> wrote:
>
> > Hi J,
> >
> > You're right: rs12345 is a regular SNP and not an insertion. In the
> > snp135 table, it is listed as:
> >
> >     chr22 25855458 25855459
> >
> > which should be interpreted as [25855458-25855459), 0-based.
> >
> > In the display, on the other hand, the location is represented in
> > our "position" format:
> >
> >     chr22:25855459-25855459
> >
> > which should be interpreted as [25855459-25855459], 1-based.
> >
> > The coordinate transforms page that Hiram pointed out is really helpful:
> > http://genomewiki.ucsc.edu/index.php/Coordinate_Transforms
> >
> > Sorry for the confusion!
> >
> > --
> > Brooke Rhead
> > UCSC Genome Bioinformatics Group
> >
> > On 3/23/12 1:52 PM, J Ireland wrote:
> >
> > > Hey Hiram,
> > >
> > > Sorry - am I missing something? rs12345 looks like a simple bi-allelic SNP
> > > (not an indel) which should have a length of 1. Its ref allele matches
> > > the ref genome. The annotation page also says it has length 1.
> > > I don't think this is the perfect case you were looking for....
> > >
> > > http://genome.ucsc.edu/cgi-bin/hgc?hgsid=248751473&o=25855458&t=25855459&g=snp135Common&i=rs12345
> > >
> > > Yep, I've seen that link.
> > >
> > > Thanks again. Sorry - I never dreamed this would turn into such a marathon
> > > thread!
> > >
> > > -J
> > >
> > > On Fri, Mar 23, 2012 at 1:34 PM, Hiram Clawson <[email protected]> wrote:
> > >
> > > > Good Afternoon J:
> > > >
> > > > This is a perfect case to illuminate the interval notation.
> > > > Note the interval this item is declared to be in: [25855459-25855459)
> > > > Note the size of this interval: end - start = 25855459 - 25855459 = 0
> > > >
> > > > The length of this item is ZERO! It doesn't exist!
> > > > It cannot be found in this sequence. It is somewhere between
> > > > actual bases: [25855458-25855459) and [25855459-25855460)
> > > > but not in this reference sequence.
> > > >
> > > > So, in one sense, you could say that the interval [25855459-25855459)
> > > > is actually talking about the gap between two bases. The genome
> > > > browser does not show gaps at all; there are only bases and they
> > > > are directly next to each other with no intervening space. We don't
> > > > do gaps between bases. When you enter such a position, we can't
> > > > display it, so we do the next best thing and display one of the bases
> > > > next to this non-existent item.
> > > >
> > > > I'm sure you have already seen our discussion:
> > > > http://genomewiki.ucsc.edu/index.php/Coordinate_Transforms
> > > >
> > > > -Hiram
> > > >
> > > > ----- Original Message -----
> > > > From: "J Ireland" <[email protected]>
> > > > To: "Hiram Clawson" <[email protected]>
> > > > Cc: [email protected]
> > > > Sent: Thursday, March 22, 2012 11:24:00 PM
> > > > Subject: Re: [Genome] base vs gap numbering
> > > >
> > > > Hey Hiram,
> > > >
> > > > I'm definitely a firm believer in the interval, trust me - and I
> > > > definitely dig the UCSC data tables and the half-open, zero-start intervals.
> > > >
> > > > So, rolling with the "everything is an interval" and looking up rs12345...
> > > >
> > > > If I zoom into the browser here:
> > > >
> > > > http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg19&position=chr22%3A25855458-25855460
> > > >
> > > > I see rs12345 appears to be at [25855458-25855459) given the hash marks.
> > > > This is a zero-based start and end.
> > > >
> > > > Now, I click on rs12345 and see the position as 25855459-25855459 in the
> > > > annotation page. Should I interpret this as [25855459-25855459), with a
> > > > one-based start and zero-based end, or as [25855459-25855459], a fully closed
> > > > interval with start and end both one-based?
> > > >
> > > > I know you're straddling different count-starting conventions between db
> > > > tables and the annotation pages. Maybe my misunderstanding is I thought the
> > > > browser was using the same convention as the annotation pages, but it looks
> > > > like the browser is actually more in line with the db tables...
> > > >
> > > > Thanks for taking this trip with me down the rabbit hole of the obscure
> > > > and arcane ;)
> > > > -J
> > > >
> > > > _______________________________________________
> > > > Genome maillist - [email protected]
> > > > https://lists.soe.ucsc.edu/mailman/listinfo/genome
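
The interval arithmetic Hiram walks through earlier in the thread (size = end - start, and an empty interval sitting between two bases) can be sketched in a few lines of Python. This is only an illustration under the conventions stated in the thread; the helper names are hypothetical and not part of any UCSC tool.

```python
def half_open_length(start, end):
    """Number of bases in a 0-based, half-open interval [start, end)."""
    return end - start


def closed_length(start, end):
    """Number of bases in a 1-based, fully-closed interval [start, end]."""
    return end - start + 1


def flanking_bases(point):
    """For an empty interval [point, point), return the half-open
    intervals of the single bases on either side of that gap."""
    return (point - 1, point), (point, point + 1)


# Read as half-open, 25855459-25855459 is empty: a gap between two bases.
print(half_open_length(25855459, 25855459))  # 0
print(flanking_bases(25855459))  # ((25855458, 25855459), (25855459, 25855460))

# Read as fully-closed (the position format), the same numbers are one base,
# matching the table entry [25855458, 25855459) for rs12345.
print(closed_length(25855459, 25855459))     # 1
print(half_open_length(25855458, 25855459))  # 1
```

The same pair of numbers therefore describes zero bases or one base depending on which convention is assumed, which is the source of the confusion in this thread.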
