Thank you for the clarification. This is what threw me (from the initial mail):

> We do have 8 bytes for the TS, though. Not enough to store nanosecs (that
> would only cover 2^63/10^9/3600/24/365.24 = 292.279 years), but enough for
> microseconds (292279 years). Should we just store the TS in microseconds?
> We could do that right now (and just keep the ms resolution for now - i.e.
> the us part would always be 0 for now).

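For what it's worth, the range arithmetic in that mail can be checked with a short Java sketch. The class and method names (TimestampRange, yearsCovered, millisToMicros) are made up for illustration and are not HBase APIs; the 365.24-day year is taken from the mail itself.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of HBase: checks how many years a signed
// 64-bit timestamp covers at a given resolution.
class TimestampRange {
    // Year length as used in the mail: 3600 s/h * 24 h/day * 365.24 days.
    static final double SECONDS_PER_YEAR = 3600.0 * 24 * 365.24;

    /** Years representable by a signed 64-bit value at ticksPerSecond. */
    static double yearsCovered(long ticksPerSecond) {
        return (double) Long.MAX_VALUE / ticksPerSecond / SECONDS_PER_YEAR;
    }

    /** Widen a millisecond timestamp to microseconds; the sub-ms part is 0. */
    static long millisToMicros(long millis) {
        return TimeUnit.MILLISECONDS.toMicros(millis);
    }

    public static void main(String[] args) {
        System.out.println("nanos cover years:  " + yearsCovered(1_000_000_000L));
        System.out.println("micros cover years: " + yearsCovered(1_000_000L));

        long nowMs = System.currentTimeMillis();
        long nowUs = millisToMicros(nowMs);
        // While only ms resolution is available, the us part is always 0.
        System.out.println("us part: " + (nowUs % 1000));
    }
}
```

The conversion goes through TimeUnit rather than a bare multiplication because TimeUnit saturates at Long.MAX_VALUE instead of silently overflowing.
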
This says we might want to store a timestamp representation that can handle microsecond resolution. My next step was to wonder about the availability and practicality of microsecond resolution clocks. I don't take Michael's position of "don't".

On Wed, Jun 11, 2014 at 10:39 PM, lars hofhansl <la...@apache.org> wrote:

> The issues you cite are all orthogonal. We have client/RS time now, we
> have clock skew now, that is completely independent from the time
> resolution.
>
> I explained the need I saw for this before. Lemme include:
>
> On Fri, May 23, 2014 at 06:16PM, lars hofhansl wrote:
> > The specific discussion here was a transaction engine doing snapshot
> > isolation using the HBase timestamps, but still be close to wall clock
> > time as much as possible.
> > In that scenario, with ms resolution you can only do 1000
> > transactions/sec, and so you need to turn the timestamp into something
> > that is not wall clock time as HBase understands it (and hence TTL, etc,
> > will no longer work, as well as any other tools you've written that use
> > the HBase timestamp).
> > 1m transactions/sec are good enough (for now, I envision in a few years
> > we'll be sitting here wondering how we could ever think that 1m
> > transaction/sec are sufficient) :)
>
> The point is: Even if you had timestamp oracle (that can resolve ms and
> fill inside ms resolution with a counter), there'd be no way to use this as
> the HBase timestamp while being close to wall clock (so that TTL, etc,
> still works).
> So specifically I was not advocating an automatic higher time resolution
> (as far as I know that cannot be done reliably in Java across
> multiple cores). I was advocating allowing clients with access to a
> (perhaps, but not necessarily single threaded) timestamp oracle to store
> those timestamps and still make use of all HBase optimization (filtering
> HFiles, TTL, etc).
>
> -- Lars
>
> ________________________________
> From: Michael Segel <michael_se...@hotmail.com>
> To: dev@hbase.apache.org
> Cc: lars hofhansl <la...@apache.org>
> Sent: Wednesday, June 11, 2014 2:03 PM
> Subject: Re: Timestamp resolution
>
> Weirdly enough I find that I have to agree with Andrew.
>
> First, how do you get time in units smaller than a ms?
> Second clock skew becomes an issue.
> Third, which clock are you using? The client machine? The RS? And then how
> do you synchronize each of the RS to be within a ms of each other?
> Correct me if I’m wrong but NTP doesn’t give that close of a sync.
>
> Sorry, but really, not a good idea.
>
> If you want this… you can store the temporal data as a column.
>
> Time really is relative.
>
> On May 25, 2014, at 12:53 AM, Stack <st...@duboce.net> wrote:
>
> > On Fri, May 23, 2014 at 5:27 PM, lars hofhansl <la...@apache.org> wrote:
> >
> >> We have discussed this in the past. It just came up again during an
> >> internal discussion.
> >> Currently we simply store a Java timestamp (millisec since epoch), i.e.
> >> we have ms resolution.
> >>
> >> We do have 8 bytes for the TS, though. Not enough to store nanosecs
> >> (that would only cover 2^63/10^9/3600/24/365.24 = 292.279 years), but
> >> enough for microseconds (292279 years).
> >> Should we just store the TS in microseconds? We could do that right now
> >> (and just keep the ms resolution for now - i.e. the us part would
> >> always be 0 for now).
> >> Existing data must be in ms of course, so we'd grandfather that in, but
> >> new tables could store by default in us.
> >>
> >> We'd need to make this configurable at both the column family level and
> >> client level, so clients could still opt to see data in ms.
> >>
> >> Comments? Too much to bite off?
> >>
> >> -- Lars
> >>
> >
> > I'm a fan. As Enis cites, HBASE-8927 has good discussion. No
> > configuration I'd say.
> > Just move to the new regime (though I suppose we
> > should let you turn it off).
> >
> > I think it was Liu Shaohui (IIRC) who made a suggestion that had us put
> > together ms and nanos under a synchronized block stamping the ts on Cells
> > (left-shift the currentTimeMillis and fill in the bottom bytes with as
> > much of the nanos as fits; i.e. your micros). Rather than nanos/micros, we
> > could use a counter instead if a Cell arrives in the same ms. Would be
> > costly having all ops go via one code block to get 'time' across cores
> > and handlers.
> >
> > St.Ack

--
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)