> Before the changes, as I am writing this reply, the local time here is 
> 2021-01-21 12:03:35 (Beijing time, UTC+8).
> And I tried these 5 functions in sql client, and got:
> 
> Flink SQL> select now(), PROCTIME(), CURRENT_TIMESTAMP, CURRENT_DATE, 
> CURRENT_TIME;
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> |                  EXPR$0 |                  EXPR$1 |       CURRENT_TIMESTAMP 
> | CURRENT_DATE | CURRENT_TIME |
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> | 2021-01-21T04:03:35.228 | 2021-01-21T04:03:35.228 | 2021-01-21T04:03:35.228 
> |   2021-01-21 | 04:03:35.228 |
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> After the changes, the expected behavior will change to:
> 
> Flink SQL> select now(), PROCTIME(), CURRENT_TIMESTAMP, CURRENT_DATE, 
> CURRENT_TIME;
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> |                  EXPR$0 |                  EXPR$1 |       CURRENT_TIMESTAMP 
> | CURRENT_DATE | CURRENT_TIME |
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> | 2021-01-21T12:03:35.228 | 2021-01-21T12:03:35.228 | 2021-01-21T12:03:35.228 
> |   2021-01-21 | 12:03:35.228 |
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> The return type of now(), proctime() and CURRENT_TIMESTAMP still be TIMESTAMP;

To Kurt, thanks  for the intuitive case, it really clear, you’re wright that I 
want to propose to change the return value of these functions. It’s the most 
important part of the topic from user's perspective.

> I think this definitely deserves a FLIP.
To Jark,  nice suggestion, I prepared a FLIP for this topic, and will start the 
FLIP discussion soon.

>> If use the default Flink SQL,  the window time range of the
>> statistics is incorrect, then the statistical results will naturally be
>> incorrect.
To zhisheng, sorry to hear that this problem influenced your production jobs,  
Could you share your SQL pattern?  we can have more inputs and try to resolve 
them.


Best,
Leonard



> On Tue, Jan 19, 2021 at 6:42 PM Leonard Xu <xbjt...@gmail.com 
> <mailto:xbjt...@gmail.com>> wrote:
> I found above example format may mess up in different mail client, I post a 
> picture here[1].
> 
> Best,
> Leonard
> 
> [1] 
> https://github.com/leonardBang/flink-sql-etl/blob/master/etl-job/src/main/resources/pictures/CURRRENT_TIMESTAMP.png
>  
> <https://github.com/leonardBang/flink-sql-etl/blob/master/etl-job/src/main/resources/pictures/CURRRENT_TIMESTAMP.png>
>  
> <https://github.com/leonardBang/flink-sql-etl/blob/master/etl-job/src/main/resources/pictures/CURRRENT_TIMESTAMP.png
>  
> <https://github.com/leonardBang/flink-sql-etl/blob/master/etl-job/src/main/resources/pictures/CURRRENT_TIMESTAMP.png>>
>  
> 
> > 在 2021年1月19日,16:22,Leonard Xu <xbjt...@gmail.com 
> > <mailto:xbjt...@gmail.com>> 写道:
> > 
> > Hi, all
> > 
> > I want to start the discussion about correcting time-related function 
> > behavior in Flink SQL, this is a tricky topic but I think it’s time to 
> > address it. 
> > 
> > Currently some temporal function behaviors are wired to users.
> > 1.  When users use a PROCTIME() in SQL, the value of PROCTIME() has a 
> > timezone offset with the wall-clock time in users' local time zone, users 
> > need to add their local time zone offset manually to get expected local 
> > timestamp(e.g: Users in Germany need to +1h to get expected local 
> > timestamp). 
> > 
> > 2. Users can not use CURRENT_DATE/CURRENT_TIME/CURRENT_TIMESTAMP  to get 
> > wall-clock timestamp in local time zone, and thus they need write UDF in 
> > their SQL just for implementing a simple filter like WHERE date_col =  
> > CURRENT_DATE. 
> > 
> > 3. Another common case  is the time window  with day interval based on 
> > PROCTIME(), user plan to put all data from one day into the same window, 
> > but the window is assigned using timestamp in UTC+0 timezone rather than 
> > the session timezone which leads to the window starts with an offset(e.g: 
> > Users in China need to add -8h in their business sql start and then +8h 
> > when output the result, the conversion like a magic for users). 
> > 
> > These problems come from that lots of time-related functions like 
> > PROCTIME(), NOW(), CURRENT_DATE, CURRENT_TIME and CURRENT_TIMESTAMP are 
> > returning time values based on UTC+0 time zone.
> > 
> > This topic will lead to a comparison of the three types, i.e. 
> > TIMESTAMP/TIMESTAMP WITHOUT TIME ZONE, TIMESTAMP WITH LOCAL TIME ZONE and 
> > TIMESTAMP WITH TIME ZONE. In order to better understand the three types, I 
> > wrote a document[1] to help understand them better. You can also know the 
> > tree timestamp types behavior in Hadoop ecosystem from the reference link 
> > int the doc.
> > 
> > 
> > I Invested all Flink time-related functions current behavior and compared 
> > with other DB vendors like Pg,Presto, Hive, Spark, Snowflake,  I made an 
> > excel [2] to organize them well, we can use it for the next discussion. 
> > Please let me know if I missed something.
> > From my investigation, I think we need to correct the behavior of function 
> > NOW()/PROCTIME()/CURRENT_DATE/CURRENT_TIME/CURRENT_TIMESTAMP, to correct 
> > them, we can change the function return type or function return value or 
> > change return type and return value both. All of those way are valid 
> > because SQL:2011 does not specify the function return type and every SQL 
> > engine vendor has its own implementation. For example the CURRENT_TIMESTAMP 
> > function,
> > 
> > FLINK current behavior        existed problem other vendors' behavior 
> > proposed change
> > CURRENT_TIMESTAMP     CURRENT_TIMESTAMP
> > TIMESTAMP(0) NOT NULL
> > 
> > #session timezone: UTC
> > 2020-12-28T23:52:52
> > 
> > #session timezone: UTC+8
> > 2020-12-28T23:52:52
> > 
> > wall clock:
> > UTC+8: 2020-12-29 07:52:52    Wrong value:returns UTC timestamp, but user 
> > expects current timestamp in session time zone      In MySQL, Spark, the 
> > function NOW() and CURRENT_TIMESTAMP return current timestamp value in 
> > session time zone,the return type is TIMESTAMP
> > 
> > In Pg, Presto, the function NOW() and LOCALTIMESTAMP return current 
> > timestamp in session time zone,the return type is TIMESTAMP WITH TIME ZONE
> > 
> > In Snowflake, the function CURRENT_TIMESTAMP / LOCALTIMESTAMP return 
> > current timestamp in session time zone,the return type is TIMESTAMP WITH 
> > LOCAL TIME ZONE Flink should return current timestamp in session time zone, 
> > the return type should be TIMESTAMP
> > 
> > 
> > I tend to only change the return value for these problematic functions and 
> > introduce an option for compatibility consideration, what do you think?
> > 
> > 
> > Looking forward to your feedback.
> > 
> > Best,
> > Leonard
> > 
> > [1] 
> > https://docs.google.com/document/d/1iY3eatV8LBjmF0gWh2JYrQR0FlTadsSeuCsksOVp_iA/edit?usp=sharing
> >  
> > <https://docs.google.com/document/d/1iY3eatV8LBjmF0gWh2JYrQR0FlTadsSeuCsksOVp_iA/edit?usp=sharing>
> >  
> > <https://docs.google.com/document/d/1iY3eatV8LBjmF0gWh2JYrQR0FlTadsSeuCsksOVp_iA/edit?usp=sharing
> >  
> > <https://docs.google.com/document/d/1iY3eatV8LBjmF0gWh2JYrQR0FlTadsSeuCsksOVp_iA/edit?usp=sharing>>
> >  
> > [2] 
> > https://docs.google.com/spreadsheets/d/1T178krh9xG-WbVpN7mRVJ8bzFnaSJx3l-eg1EWZe_X4/edit?usp=sharing
> >  
> > <https://docs.google.com/spreadsheets/d/1T178krh9xG-WbVpN7mRVJ8bzFnaSJx3l-eg1EWZe_X4/edit?usp=sharing>
> >  
> > <https://docs.google.com/spreadsheets/d/1T178krh9xG-WbVpN7mRVJ8bzFnaSJx3l-eg1EWZe_X4/edit?usp=sharing
> >  
> > <https://docs.google.com/spreadsheets/d/1T178krh9xG-WbVpN7mRVJ8bzFnaSJx3l-eg1EWZe_X4/edit?usp=sharing>>
> >  
> 

Reply via email to