On Thu, Jun 12, 2014 at 11:30 AM, Michael Segel <michael_se...@hotmail.com> wrote:
> From what I could see Joda time is ms not microseconds. Of course that was
> from a couple of years ago. Nothing popped out on their website itself.
>
> Again, when you start to get below the ms timestamp, you need to be a bit
> careful on relativity.
>

Even above the millisecond scale, you can't rely on wall clock measurements
directly either. For example, right now, my ntp-synchronized laptop is
estimating 3.1ms of error with a max error bound of 139ms. Anyone using the
timestamp component alone as a way to determine event ordering is already
fooling themselves.

>
> What does that timestamp really mean?
>
> Let's walk through your example and then we can see why we’re talking past
> ourselves.
>
> Time really is relative.
>
> If you’re looking at sensor data, then it's not a TS but the temporal data
> is an element of the record and not the metadata.
>

I think folks already answered this - in the use case at hand, it's not
actually a time measurement, but rather a number generated by a timestamp
oracle.

-Todd

>
>
> On Jun 12, 2014, at 7:13 PM, Todd Lipcon <t...@cloudera.com> wrote:
>
> > The OS has it. Here's an implementation from one of my C++ projects:
> >
> > // Returns the time since the Epoch measured in microseconds.
> > inline MicrosecondsInt64 GetCurrentTimeMicros() {
> >   timespec ts;
> >   clock_gettime(CLOCK_REALTIME, &ts);
> >   return ts.tv_sec * 1000000 + ts.tv_nsec / 1000;
> > }
> >
> > Whether it's trivially available from Java, I'm not sure. But I seem to
> > recall that JodaTime has it, no?
> >
> > -Todd
> >
> >
> > On Thu, Jun 12, 2014 at 11:06 AM, Michael Segel
> > <michael_se...@hotmail.com> wrote:
> >
> >> Silly question.
> >> How do you get time in microseconds?
> >>
> >>
> >> On Jun 12, 2014, at 2:56 PM, Andrew Purtell <apurt...@apache.org> wrote:
> >>
> >>> Thank you for the clarification, this is what threw me (the initial
> >>> mail):
> >>>
> >>>> We do have 8 bytes for the TS, though.
> >>>> Not enough to store nanosecs (that would only cover
> >>>> 2^63/10^9/3600/24/365.24 = 292.279 years), but enough for microseconds
> >>>> (292279 years).
> >>>> Should we just store the TS in microseconds? We could do that right now
> >>>> (and just keep the ms resolution for now - i.e. the us part would
> >>>> always be 0 for now).
> >>>
> >>>
> >>> This says we might want to store a timestamp representation that can
> >>> handle microsecond resolution. My next step was to wonder about the
> >>> availability and practicality of microsecond resolution clocks. I don't
> >>> take Michael's position of "don't".
> >>>
> >>>
> >>>
> >>> On Wed, Jun 11, 2014 at 10:39 PM, lars hofhansl <la...@apache.org> wrote:
> >>>
> >>>> The issues you cite are all orthogonal. We have client/RS time now, we
> >>>> have clock skew now, that is completely independent from the time
> >>>> resolution.
> >>>>
> >>>>
> >>>> I explained the need I saw for this before. Lemme include:
> >>>>
> >>>> On Fri, May 23, 2014 at 06:16PM, lars hofhansl wrote:
> >>>>> The specific discussion here was a transaction engine doing snapshot
> >>>>> isolation using the HBase timestamps, but still be close to wall
> >>>>> clock time as much as possible.
> >>>>> In that scenario, with ms resolution you can only do 1000
> >>>>> transactions/sec, and so you need to turn the timestamp into
> >>>>> something that is not wall clock time as HBase understands it (and
> >>>>> hence TTL, etc, will no longer work, as well as any other tools
> >>>>> you've written that use the HBase timestamp).
> >>>>> 1m transactions/sec are good enough (for now, I envision in a few
> >>>>> years we'll be sitting here wondering how we could ever think that 1m
> >>>>> transaction/sec are sufficient) :)
> >>>>>
> >>>>
> >>>>
> >>>> The point is: Even if you had a timestamp oracle (that can resolve ms
> >>>> and fill inside ms resolution with a counter), there'd be no way to use
> >>>> this as the HBase timestamp while being close to wall clock (so that
> >>>> TTL, etc, still works).
> >>>> So specifically I was not advocating an automatic higher time
> >>>> resolution (as far as I know that cannot be done reliably in Java
> >>>> across multiple cores). I was advocating allowing clients with access
> >>>> to a (perhaps, but not necessarily single threaded) timestamp oracle to
> >>>> store those timestamps and still make use of all HBase optimizations
> >>>> (filtering HFiles, TTL, etc).
> >>>>
> >>>>
> >>>> -- Lars
> >>>>
> >>>>
> >>>>
> >>>> ________________________________
> >>>> From: Michael Segel <michael_se...@hotmail.com>
> >>>> To: dev@hbase.apache.org
> >>>> Cc: lars hofhansl <la...@apache.org>
> >>>> Sent: Wednesday, June 11, 2014 2:03 PM
> >>>> Subject: Re: Timestamp resolution
> >>>>
> >>>>
> >>>> Weirdly enough I find that I have to agree with Andrew.
> >>>>
> >>>> First, how do you get time in units smaller than a ms?
> >>>> Second, clock skew becomes an issue.
> >>>> Third, which clock are you using? The client machine? The RS? And then
> >>>> how do you synchronize each of the RS to be within a ms of each other?
> >>>> Correct me if I’m wrong but NTP doesn’t give that close of a sync.
> >>>>
> >>>> Sorry, but really, not a good idea.
> >>>>
> >>>> If you want this… you can store the temporal data as a column.
> >>>>
> >>>> Time really is relative.
> >>>>
> >>>>
> >>>> On May 25, 2014, at 12:53 AM, Stack <st...@duboce.net> wrote:
> >>>>
> >>>>> On Fri, May 23, 2014 at 5:27 PM, lars hofhansl <la...@apache.org>
> >>>>> wrote:
> >>>>>
> >>>>>> We have discussed this in the past. It just came up again during an
> >>>>>> internal discussion.
> >>>>>> Currently we simply store a Java timestamp (millisec since epoch),
> >>>>>> i.e. we have ms resolution.
> >>>>>>
> >>>>>> We do have 8 bytes for the TS, though. Not enough to store nanosecs
> >>>>>> (that would only cover 2^63/10^9/3600/24/365.24 = 292.279 years),
> >>>>>> but enough for microseconds (292279 years).
> >>>>>> Should we just store the TS in microseconds? We could do that right
> >>>>>> now (and just keep the ms resolution for now - i.e. the us part
> >>>>>> would always be 0 for now).
> >>>>>> Existing data must be in ms of course, so we'd grandfather that in,
> >>>>>> but new tables could store by default in us.
> >>>>>>
> >>>>>> We'd need to make this configurable at both the column family level
> >>>>>> and client level, so clients could still opt to see data in ms.
> >>>>>>
> >>>>>> Comments? Too much to bite off?
> >>>>>>
> >>>>>> -- Lars
> >>>>>>
> >>>>>>
> >>>>> I'm a fan. As Enis cites, HBASE-8927 has good discussion. No
> >>>>> configuration I'd say. Just move to the new regime (though I suppose
> >>>>> we should let you turn it off).
> >>>>>
> >>>>> I think it was Liu Shaohui (IIRC) who made a suggestion that had us
> >>>>> put together ms and nanos under a synchronized block stamping the ts
> >>>>> on Cells (left-shift the currentTimeMillis and fill in the bottom
> >>>>> bytes with as much of the nanos as fits; i.e. your micros). Rather
> >>>>> than nanos/micros, we could use a counter instead if a Cell arrives
> >>>>> in the same ms. Would be costly having all ops go via one code block
> >>>>> to get 'time' across cores and handlers.
> >>>>>
> >>>>> St.Ack
> >>>>
> >>>
> >>>
> >>>
> >>> --
> >>> Best regards,
> >>>
> >>> - Andy
> >>>
> >>> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> >>> (via Tom White)
> >>
> >>
> >
> >
> > --
> > Todd Lipcon
> > Software Engineer, Cloudera
>

--
Todd Lipcon
Software Engineer, Cloudera
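
For readers following the thread, here is a minimal sketch of the "millisecond plus counter" stamping that Stack and Lars describe in the quoted messages: take the wall clock in ms, scale it to microseconds, and fill the sub-millisecond part with a counter so that stamps issued within the same ms stay unique and ordered. This is an illustration only, not HBase code. The class name MicrosTimestampOracle, its methods, and the choice of 1000 slots per millisecond are all assumptions made for the example. It uses decimal scaling rather than the left-shift mentioned in the thread, so the result reads directly as microseconds since the Epoch.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <time.h>

using std::int64_t;

// Hypothetical illustration, not HBase code: every name here is made up.
// Hands out strictly increasing timestamps in microseconds. The millisecond
// part comes from the wall clock; the sub-millisecond part is a counter.
class MicrosTimestampOracle {
 public:
  // Reads the wall clock and returns the next unique timestamp.
  int64_t Next() { return NextFromMillis(WallClockMillis()); }

  // Core logic, with the millisecond reading passed in so it can be
  // exercised without depending on the real clock.
  int64_t NextFromMillis(int64_t now_ms) {
    // Single lock: this is the "one code block" cost Stack points out.
    std::lock_guard<std::mutex> guard(mu_);
    if (now_ms > last_ms_) {
      // The clock moved forward: start a fresh millisecond.
      last_ms_ = now_ms;
      counter_ = 0;
    } else if (++counter_ == kSlotsPerMilli) {
      // Same (or earlier) ms and all slots are used: borrow the next ms,
      // which lets the stamp run slightly ahead of the wall clock.
      ++last_ms_;
      counter_ = 0;
    }
    assert(counter_ >= 0 && counter_ < kSlotsPerMilli);
    return last_ms_ * kSlotsPerMilli + counter_;
  }

  // Milliseconds since the Epoch, read the same way as the
  // GetCurrentTimeMicros() snippet quoted in the thread.
  static int64_t WallClockMillis() {
    timespec ts;
    clock_gettime(CLOCK_REALTIME, &ts);
    return static_cast<int64_t>(ts.tv_sec) * 1000 + ts.tv_nsec / 1000000;
  }

 private:
  static constexpr int64_t kSlotsPerMilli = 1000;  // 1000 stamps per ms
  std::mutex mu_;
  int64_t last_ms_ = 0;
  int64_t counter_ = 0;
};
```

With 1000 slots per millisecond this allows about one million distinct stamps per second while staying close to wall clock time, which matches the "1m transactions/sec" figure in the quoted discussion. Under sustained load above that rate, or if the clock steps backwards, the stamps drift ahead of the wall clock instead of repeating. None of this addresses skew between machines.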