Yes, we always truncate timestamps to five minutes, so the cardinality is acceptable. Maybe I will just switch from DateStrDictionary to TrieDictionary when building the dictionary for the timestamp column.
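
For reference, here is a minimal sketch of the five-minute truncation mentioned above. It is not Kylin code: the class name `TimestampTruncation` and the method `truncateToFiveMinutes` are hypothetical, and it assumes the values arrive as `yyyy-MM-dd HH:mm:ss` strings like the ones in the Hive output quoted below.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;

// Hypothetical helper, for illustration only (not part of Kylin).
public class TimestampTruncation {
    // Assumed input/output format, matching the Hive values in this thread.
    private static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Round a timestamp string down to the start of its five-minute bucket.
    public static String truncateToFiveMinutes(String ts) {
        LocalDateTime t = LocalDateTime.parse(ts, FMT);
        LocalDateTime bucket = t.truncatedTo(ChronoUnit.MINUTES)   // drop seconds
                                .minusMinutes(t.getMinute() % 5);  // snap to 0, 5, 10, ...
        return bucket.format(FMT);
    }

    public static void main(String[] args) {
        System.out.println(truncateToFiveMinutes("2015-12-12 16:02:18")); // 2015-12-12 16:00:00
        System.out.println(truncateToFiveMinutes("2015-12-13 06:28:40")); // 2015-12-13 06:25:00
    }
}
```

With this kind of normalization a column holds at most 288 distinct values per day, which is why the cardinality stays manageable for us.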

2015-12-15 16:09 GMT+08:00 hongbin ma <[email protected]>:

> However, it still requires caution if you're using a timestamp column.
> A timestamp column may have very high cardinality if you don't apply any
> normalization to it. Usually it's suggested to truncate the seconds or
> minutes to reduce cardinality.
>
> On Tue, Dec 15, 2015 at 4:07 PM, hongbin ma <[email protected]> wrote:
>
> > In 2.x versions, timestamp is supported.
> >
> > On Tue, Dec 15, 2015 at 4:00 PM, yu feng <[email protected]> wrote:
> >
> >> Hi All:
> >> I built a cube; the fact table looks like this:
> >> hive> describe testtimestamp;
> >> OK
> >> ts      timestamp
> >> fname   string
> >> lname   string
> >> type    int
> >> cost    int
> >>
> >> I built a cube with dimensions 'ts', 'fname', 'lname' and 'type'. However,
> >> after building the cube, I ran a query like 'select distinct ts from
> >> testtimestamp', and it returned:
> >> +---------------------+
> >> | TS                  |
> >> +---------------------+
> >> | 2015-12-14 16:00:00 |
> >> | 2015-12-12 16:00:00 |
> >> | 2015-12-11 16:00:00 |
> >> | 2015-12-09 16:00:00 |
> >> | 2015-12-10 16:00:00 |
> >> | 2015-12-15 16:00:00 |
> >> | 2015-12-13 16:00:00 |
> >> +---------------------+
> >>
> >> This result is wrong. When I run the same query in Hive, it returns:
> >> 2015-12-10 00:00:00
> >> 2015-12-11 01:02:03
> >> 2015-12-12 05:02:10
> >> 2015-12-12 06:08:10
> >> 2015-12-12 16:02:18
> >> 2015-12-13 06:28:40
> >> 2015-12-14 03:20:15
> >> 2015-12-14 11:04:18
> >> 2015-12-15 10:13:21
> >> 2015-12-16 12:04:12
> >>
> >> I know the reason: Kylin uses DateStrDictionary to build the dictionary
> >> for column types like "date", "time", "datetime" and "timestamp", and it
> >> tries to parse column values with SimpleDateFormat("yyyy-MM-dd"). So after
> >> the dictionary is built, timestamp values on the same day are transformed
> >> to the same Date value.
> >>
> >> Is it a bug, or is there some other consideration?
> >>
> >
> > --
> > Regards,
> >
> > *Bin Mahone | 马洪宾*
> > Apache Kylin: http://kylin.io
> > Github: https://github.com/binmahone
>
> --
> Regards,
>
> *Bin Mahone | 马洪宾*
> Apache Kylin: http://kylin.io
> Github: https://github.com/binmahone
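
To illustrate the cause described in the original question, here is a small standalone sketch of how parsing with a day-only pattern collapses timestamps that fall on the same day. It does not use Kylin's DateStrDictionary; the class `DayPatternCollapse` and the method `distinctParsed` are hypothetical, and the GMT+08:00 time zone is an assumption taken from the mail headers in this thread. The last line in `main` shows that local midnight renders as 16:00:00 of the previous day in UTC, which is consistent with the values in the query result above, although I have not confirmed that this is what happens inside Kylin.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.TimeZone;

// Hypothetical demo class, for illustration only (not part of Kylin).
public class DayPatternCollapse {
    // Parse every value with the given pattern and collect the distinct epoch millis.
    public static Set<Long> distinctParsed(String pattern, String... values) {
        SimpleDateFormat fmt = new SimpleDateFormat(pattern);
        fmt.setTimeZone(TimeZone.getTimeZone("GMT+08:00")); // assumed server time zone
        Set<Long> out = new LinkedHashSet<>();
        for (String v : values) {
            try {
                // parse(String) stops once the pattern is satisfied, so a
                // day-only pattern silently ignores the trailing time of day.
                out.add(fmt.parse(v).getTime());
            } catch (ParseException e) {
                throw new IllegalArgumentException("Unparseable value: " + v, e);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String[] sameDay = {"2015-12-12 05:02:10", "2015-12-12 06:08:10", "2015-12-12 16:02:18"};
        System.out.println(distinctParsed("yyyy-MM-dd", sameDay).size());          // 1
        System.out.println(distinctParsed("yyyy-MM-dd HH:mm:ss", sameDay).size()); // 3

        // Local midnight in GMT+08:00, rendered in UTC.
        SimpleDateFormat utc = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        utc.setTimeZone(TimeZone.getTimeZone("UTC"));
        long midnight = distinctParsed("yyyy-MM-dd", sameDay).iterator().next();
        System.out.println(utc.format(new Date(midnight)));                        // 2015-12-11 16:00:00
    }
}
```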
