[ https://issues.apache.org/jira/browse/KYLIN-4010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
nichunen resolved KYLIN-4010. ----------------------------- Resolution: Fixed > Auto adjust offset according to query server's timezone for time derived > column > ------------------------------------------------------------------------------- > > Key: KYLIN-4010 > URL: https://issues.apache.org/jira/browse/KYLIN-4010 > Project: Kylin > Issue Type: Improvement > Components: Others > Affects Versions: v3.0.0-alpha > Reporter: zengrui > Assignee: Xiaoxiang Yu > Priority: Minor > Fix For: v3.0.0-beta > > Attachments: image-2019-07-15-17-15-31-209.png, > image-2019-07-15-17-17-04-029.png, image-2019-07-15-17-17-39-568.png, > image-2019-09-22-16-35-23-663.png, image-2019-09-22-16-43-19-248.png, > image-2019-09-22-20-28-02-402.png, image-2019-09-22-20-35-19-038.png, > image-2019-09-22-20-53-34-720.png, image-2019-09-24-17-23-42-451.png, > image-2019-09-24-17-33-29-824.png, image-2019-09-24-17-35-47-589.png, > image-2019-09-24-17-36-39-611.png > > > h2. Backgroud > In realtime OLAP, we index real-time event in streaming receiver. We know > that each event must contains a timestamp column (we often call it event > time), that value should represent when this event was produced. Because > event maybe come from different timezone and use local timezone is always > *error-prone*, so we recommend to use a {color:#DE350B}GMT+0{color} > timestamp(System.currentTimeMillis()) to avoid such issue. > I think this is good by design, it is easy to understand and always correct. > But the *side effect* is that, the end user(business manager behind a BI > tools) are unhappy because he have to use GMT+0 with date/time related filter > in SQL and should understand the result should be *shifted* with his local > timezone. It is not user-firendly and inconvenient for normal user. Because > user may compare query result from different data source and compare them and > summarize, use GMT+0 may trouble them. > h2. Example > For example, kylin user work in *GMT+8* (maybe in Shanghai) want to know some > metrics which occured from {color:#DE350B}2019-09-01 12:00:00{color} to > {color:#DE350B}2019-09-01 14:00:00{color} in his {color:#DE350B}local > timezone{color}, so he has to {color:#DE350B}rewrite{color} his query (with > eight hour offset) to following: > {code:sql} > select hour_start, count(*) > from realtime_table > where hour_start >= "2019-09-01 04:00:00" and hour_start < "2019-09-01 > 06:00:00" > group by hour_start > {code} > And he will get result like : > ||hour_start ||count|| > |2019-09-01 04:00:00 |139202| > |2019-09-01 05:00:00 |89398| > And he must convert to a more meaningful result in his mind, it is realy > annoying! > ||hour_start ||count|| > |2019-09-01 12:00:00 |139202| > |2019-09-01 13:00:00 |89398| > h2. Desgin > We should not change the way receiver index event, event time should be > stored in UTC timestamp. We should auto rewrite sql's event time related > filter. > In kylin, filter condition in where clause will be convert to a > *TupleFilter*, and it looks like *RelNode* in Apache Calicate. > For where hour_start >= "2019-09-01 12:00:00" and hour_start < "2019-09-01 > 14:00:00", we will send TupleFilter to streaming receiver or region server > which looks like this: > {noformat} > AND > GreatThanOrEqual > hout_start > CAST > "2019-09-01 12:00:00" > timestamp > LessThanOrEqual > hout_start > CAST > "2019-09-01 14:00:00" > timestamp > {noformat} > But for streaming query, we want to change each ConstantTupleFilter and minus > value for that timestamp. So the TupleFilter which be sent will be following: > {noformat} > AND > GreatThanOrEqual > hout_start > CAST > "2019-09-01 04:00:00" > timestamp > LessThanOrEqual > hout_start > CAST > "2019-09-01 06:00:00" > timestamp > {noformat} > Before query result processed by *OLAPEnumerator*, kylin will plus each > value of time derived column, thus protect row from be filtered by calcite > generated code. > So, user will get what he want in his timezone without any burden. > h2. How to use > To enable auto shift by time zone, please set > {color:#DE350B}kylin.stream.auto.just.by.timezone{color} to true. > You can specific time zone by {color:#DE350B}kylin.web.timezone{color}, > otherwise, time zone will be auto detected. > Only *time derived column* will be affected. > h2. Related Issue > Originally, the event time can only in the format of a long value (UTC > timestamp). But in some case, the event time is in a format of "yyyy-MM-dd > HH:mm:ss", we use a new class DateTimeParser(introduced in KYLIN-4001) to > convert such format into a UTC timestamp. > h3. Old Describe > In Real-Time Streaming Cube when I send some records to kafka topic, the > tmestamp for the record is 2019-01-01 00:00:00.000, but kylin create a > segment named 20181231160000_20181231170000. > Then I found that TimeZone is hard-coded to "GMT" in function makeSegmentName > for class CubeSegment. I think that it should be config in kylin.properties. -- This message was sent by Atlassian Jira (v8.3.4#803005)