Hey Hiram, I'm definitely a firm believer in the interval, trust me - and I definitely dig the UCSC data tables and the half-open, zero-start intervals.
So, rolling with the "everything is an interval" idea and looking up rs12345...

If I zoom into the browser here:
http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg19&position=chr22%3A25855458-25855460
I see rs12345 appears to be at [25855458-25855459) given the hash marks. This is a zero-based start and end.

Now, I click on rs12345 and see the position as 25855459-25855459 on the annotation page. Should I interpret this as [25855459-25855459), with a one-based start and zero-based end, or as [25855459-25855459], a fully closed interval with start and end both one-based?

I know you're straddling different count-starting conventions between db tables and the annotation pages. Maybe my misunderstanding is that I thought the browser was using the same convention as the annotation pages, but it looks like the browser is actually more in line with the db tables...

Thanks for taking this trip with me down the rabbit hole of the obscure and arcane ;)

-J

On Thu, Mar 22, 2012 at 10:47 PM, Hiram Clawson <[email protected]> wrote:
> Actually J, the UCSC "numbering" system isn't about bases
> or the gaps between bases at all, but rather intervals.
> For example the first base in a chromosome is the interval [0-1)
> The second base in a chromosome is the interval [1-2)
> The first and second base is the interval [0-2)
> And so forth. We don't apply an actual number to any specific
> base or gap between a base. To reference a particular nucleotide
> in a chromosome, the UCSC interval needs to be specified.
> It depends upon what label number you want to apply to
> that first base in the chromosome. Is it base 0 or base 1 ?
>
> Which came first ...
>
> --Hiram
>
>
> ----- Original Message -----
> From: "J Ireland" <[email protected]>
> To: "Hiram Clawson" <[email protected]>
> Cc: [email protected]
> Sent: Thursday, March 22, 2012 4:27:00 PM
> Subject: Re: [Genome] base vs gap numbering
>
> Hey Hiram,
>
>
> Nice hearing from you!
>
>
> OK, I don't mean to be a pain but I think I'm still not getting my
> question across. Let me give it one more shot, and if I'm still not making
> sense maybe it's a discussion to have over beers at ISMB.
>
>
> From everything I've read on the UCSC site, it seems that the UCSC
> convention is to number the bases themselves. As you point out, however
> "the hash marks are on the "gaps" between the bases" when you zoom in. I
> take this to mean you're numbering the "gaps" - not the bases. So, what's
> not obvious to me is why the numbered hash marks are not over the bases (at
> a zoomed in level of course) if it's the bases that are being numbered.
>
>
> Thanks much,
> -J
>
_______________________________________________
Genome maillist - [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
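
For reference, the two readings discussed in this thread can be compared with a small conversion sketch. It assumes the db tables and the zoomed-in hash marks use zero-based, half-open intervals, and that the annotation page reports one-based, fully closed positions; the thread itself leaves that second point as an open question. The function names are hypothetical and not part of any UCSC tool.

```python
def to_one_based_closed(start, end):
    """Convert a zero-based, half-open interval [start, end) to a
    one-based, fully closed interval [first, last].

    Hypothetical helper; assumes the interval covers at least one base.
    """
    if start < 0 or end <= start:
        raise ValueError("expected 0 <= start < end")
    # Only the start shifts: the half-open end already equals the
    # one-based index of the last base covered.
    return start + 1, end


def to_zero_based_half_open(first, last):
    """Convert a one-based, fully closed interval [first, last] back to
    a zero-based, half-open interval [start, end)."""
    if first < 1 or last < first:
        raise ValueError("expected 1 <= first <= last")
    return first - 1, last


if __name__ == "__main__":
    # rs12345 as read off the hash marks in the thread: [25855458-25855459)
    print(to_one_based_closed(25855458, 25855459))  # (25855459, 25855459)
    # Hiram's examples: first base [0-1), first two bases [0-2)
    print(to_one_based_closed(0, 1))  # (1, 1)
    print(to_one_based_closed(0, 2))  # (1, 2)
```

Under that assumption, the interval [25855458-25855459) from the hash marks and the 25855459-25855459 shown on the annotation page describe the same single base, which corresponds to the fully closed, one-based reading.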
