Re: Timestamps with less than nano precision: rounding vs truncating

Jim Apple Thu, 24 Jan 2019 11:54:03 -0800

You use the phrase "truncation towards negative infinity". Can you give a
concrete example where "truncation towards negative infinity" is different
from "truncation" and another where "truncation towards negative infinity"
is different from "rounding towards negative infinity"? As I mentioned
above, I am not aware of a rounding mode entitled "truncation towards
negative infinity".


On Thu, Jan 24, 2019 at 11:41 AM Csaba Ringhofer <[email protected]>
wrote:

> Thanks for the comments!
>
> > Can you give a concrete example of an query where you are proposing a
> change?
> create table tkudu (id int primary key, t timestamp) stored as kudu;
> insert into tkudu values
>   (1,"1970-01-01 00:00:00.1111111"), -- all sub-second parts are 7 digit
>   (2,"1970-01-01 23:59:59.9999999"),
>   (3,"1969-12-31 23:59:59.9999999");
> select * from tkudu;
>
> This currently returns:
> 1,1970-01-01 00:00:00.111111000
> 2,1970-01-02 00:00:00
> 3,1970-01-01 00:00:00
>
> 1 was rounded down to microsec precision, while 2 and 3 were rounded up and
> also stepped to another way.
> With truncation towards negative infinity the query would return this:
> 1,1970-01-01 00:00:00.111111000
> 2,1970-01-01 23:59:59.999999000
> 3,1969-12-31 23:59:59.999999000
> So 1 would be the same, and 2 and 3 would be truncated from 7 to 6 digits
> and would not step to a new day.
>
> My goal is not to change how Impala writes Kudu, but to get to a consensus
> before going forward with writing Parquet milli/micro timestamps. (
> https://gerrit.cloudera.org/#/c/12247/ )
>
> >That said, Oracle, Netezza, Vertica, and Postgres all round. Db2
> truncates.
> Thanks, good to know!
> So it looks like that rounding is more popular, but consistency with Hive
> may be more important in Impala's case.
>
> On Thu, Jan 24, 2019 at 7:59 PM Greg Rahn <[email protected]> wrote:
>
> > For things like this the ISO SQL spec states  "the choice of whether to
> > round or truncate is implementation-defined".  That said, Oracle,
> Netezza,
> > Vertica, and Postgres all round. Db2 truncates.
> >
> > On Wed, Jan 23, 2019 at 12:26 PM Csaba Ringhofer <
> [email protected]
> > >
> > wrote:
> >
> > > Timestamps are often represented as ticks since some epoch, e.g.
> > 1970.01.01
> > > 00:00:00, so negative timestamps make sense as times before the epoch
> - I
> > > meant rounding vs truncating towards 0 vs rounding towards negative
> > > infinite in this sense. Truncating towards negative infinity means that
> > > timestamps are always truncated to an earlier timestamp. Truncating
> > towards
> > > 0 would mean that before 1970, timestamps are truncated upwards, which
> > can
> > > lead to similar troubles as the I ones mentioned with rounding. On x86
> > c++,
> > > when a time_t is divided by an integer, the result is rounded towards
> 0,
> > so
> > > a naive implementation that uses time_t to represent timestamps can
> > > truncate towards 0, but In impala::TimestampValue, time_ should be the
> > > non-negative nanoseconds since midnight, so it can be simply divided
> with
> > > 1000 to convert from nanoseconds to microseconds.
> > >
> >
>

Re: Timestamps with less than nano precision: rounding vs truncating

Reply via email to