Yes, we always truncate timestamps to five minutes, so the cardinality is
acceptable. Maybe I can just switch from DateStrDictionary to TrieDictionary
when building the dictionary for the timestamp column.
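
For reference, here is a minimal sketch of the five-minute truncation I mean,
assuming the values arrive as 'yyyy-MM-dd HH:mm:ss' strings. The class and
method names are made up for illustration and are not part of Kylin:

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;

// Hypothetical helper, not Kylin code: rounds a timestamp string down
// to the start of its five-minute bucket.
final class FiveMinuteTruncation {
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss");

    static String truncateToFiveMinutes(String timestamp) {
        LocalDateTime parsed = LocalDateTime.parse(timestamp, FORMAT);
        // Drop seconds first, then step back to the previous multiple of 5 minutes.
        LocalDateTime toMinute = parsed.truncatedTo(ChronoUnit.MINUTES);
        return toMinute.minusMinutes(toMinute.getMinute() % 5).format(FORMAT);
    }
}
```

With this, a column covering one day has at most 288 distinct values.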

2015-12-15 16:09 GMT+08:00 hongbin ma <[email protected]>:

> However, it still requires caution if you're using a timestamp column.
> A timestamp column may have very high cardinality if you don't apply any
> normalization to it. It's usually suggested to truncate the seconds or
> minutes to reduce cardinality.
>
> On Tue, Dec 15, 2015 at 4:07 PM, hongbin ma <[email protected]> wrote:
>
> > In 2.x versions, timestamp is supported.
> >
> > On Tue, Dec 15, 2015 at 4:00 PM, yu feng <[email protected]> wrote:
> >
> >> Hi all,
> >>     I built a cube on a fact table like this:
> >> hive> describe testtimestamp;
> >> OK
> >> ts                   timestamp
> >> fname               string
> >> lname               string
> >> type                 int
> >> cost                 int
> >>
> >> I built a cube with dimensions 'ts', 'fname', 'lname' and 'type'. However,
> >> after building the cube, I ran a query like 'select distinct ts from
> >> testtimestamp', and it returned:
> >> +---------------------+
> >> |         TS          |
> >> +---------------------+
> >> | 2015-12-14 16:00:00 |
> >> | 2015-12-12 16:00:00 |
> >> | 2015-12-11 16:00:00 |
> >> | 2015-12-09 16:00:00 |
> >> | 2015-12-10 16:00:00 |
> >> | 2015-12-15 16:00:00 |
> >> | 2015-12-13 16:00:00 |
> >> +---------------------+
> >>
> >> This is a wrong result. When I run the same query in Hive, it returns:
> >> 2015-12-10 00:00:00
> >> 2015-12-11 01:02:03
> >> 2015-12-12 05:02:10
> >> 2015-12-12 06:08:10
> >> 2015-12-12 16:02:18
> >> 2015-12-13 06:28:40
> >> 2015-12-14 03:20:15
> >> 2015-12-14 11:04:18
> >> 2015-12-15 10:13:21
> >> 2015-12-16 12:04:12
> >>
> >> I know the reason: Kylin uses DateStrDictionary to build the dictionary
> >> for column types "date", "time", "datetime" and "timestamp". It then
> >> tries to parse the column values with SimpleDateFormat("yyyy-MM-dd"), so
> >> after the dictionary is built, timestamp values within the same day are
> >> transformed to the same Date value.
> >>
> >> Is this a bug, or is there some other consideration?
> >>
> >
> >
> >
> > --
> > Regards,
> >
> > *Bin Mahone | 马洪宾*
> > Apache Kylin: http://kylin.io
> > Github: https://github.com/binmahone
> >
>
>
>
> --
> Regards,
>
> *Bin Mahone | 马洪宾*
> Apache Kylin: http://kylin.io
> Github: https://github.com/binmahone
>
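
P.S. The day-level collapse described in my original mail can be reproduced
with plain JDK calls. This is only a sketch of the SimpleDateFormat behavior,
not Kylin's actual DateStrDictionary code; the class and method names are
hypothetical, and the UTC time zone is pinned only to make the result
deterministic:

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.TimeZone;

// Hypothetical demo, not Kylin code: shows that a "yyyy-MM-dd" pattern
// ignores the time-of-day part of a full timestamp string.
final class DayLevelParseDemo {
    static long parseWithDayPattern(String timestamp) {
        SimpleDateFormat dayFormat = new SimpleDateFormat("yyyy-MM-dd");
        dayFormat.setTimeZone(TimeZone.getTimeZone("UTC"));
        try {
            // parse(String) stops once the pattern is satisfied, so the
            // trailing " HH:mm:ss" text is silently dropped.
            return dayFormat.parse(timestamp).getTime();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable value: " + timestamp, e);
        }
    }
}
```

Here parseWithDayPattern("2015-12-12 05:02:10") and
parseWithDayPattern("2015-12-12 16:02:18") return the same epoch value, which
would explain why only one row per day survives in the cube.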
