I also prefer to not expand this FLIP further, but we could open a
discussion thread
right after this FLIP being accepted and start coding & reviewing. Make
technique
discussion and coding more pipelined will improve efficiency.

Best,
Kurt


On Sat, Jan 30, 2021 at 3:47 PM Leonard Xu <xbjt...@gmail.com> wrote:

> Hi, Timo
>
> > I do think that this topic must be part of the FLIP as well. Esp. if the
> FLIP has the title "time function behavior" and this is clearly a
> behavioral aspect. We are performing a heavy refactoring of the SQL query
> semantics in Flink here which will affect a lot of users. We cannot rework
> the time functions a third time after this.
> > I checked a couple of other vendors. It seems that they all lock the
> timestamp when the query is started. And as you said, in this case both
> mature (Oracle) and less mature systems (Hive, MySQL) have the same
> behavior.
>
> FLIP-162> “These problems come from the fact that lots of time-related
> functions like PROCTIME(), NOW(), CURRENT_DATE, CURRENT_TIME and
> CURRENT_TIMESTAMP are returning time values based on UTC+0 time zone."
> The motivation of  FLIP-162 is to correct the wrong time-related function
> value which caused by timezone. And after our discussed before, we found
> it's related to the function return type compared to SQL standard and other
> vendors and thus we proposed make the function return type also consistent.
> This is the exact meaning of the FLIP  title and that the FLIP plans to do.
>
> But for the function materialization mechanism, we didn't consider yet as
> a part of our plan because we need to fix the timezone and function type
> issues no matter we modify the function materialization mechanism in the
> future or not.
> So I think it's not belong to this FLIP scope.
>
> It will have been a great work if we can fix current FLIP's 7 proposals
> well, we don't want to expand the scope again Eps it's not part of our
> plan.
>
> What do you think? @Timo
>
> And what’s others' thoughts?  @Jark @Kurt
>
> Best,
> Leonard
>
>
>
>
> > Flink should not differ. I fear that we have to adopt this behavior as
> well to call us standard compliant. Otherwise it will also not be possible
> to have Hive compatibility with proper semantics. It could lead to
> unintended behavior.
> >
> > I see two options for this topic:
> >
> > 1) Clearly distinguish between query-start and processing time
> >
> > MySQL offers NOW() and SYSDATE() to distinguish the two semantics. We
> could run all the previously discussed functions that have a meaning in
> other systems in query-start time and use a different name for processing
> time. `SYS_TIMESTAMP`, `SYS_DATE`, `SYS_TIME`, `SYS_LOCALTIMESTAMP`,
> `SYS_LOCALDATE`, `SYS_LOCALTIME`?
> >
> > 2) Introduce a config option
> >
> > We are non-compliant by default and allow typical batch behavior if
> needed via a config option. But batch/stream unification should not mean
> that we disable certain unification aspects by default.
> >
> > What do you think?
> >
> > Regards,
> > Timo
> >
> > On 28.01.21 16:51, Leonard Xu wrote:
> >> Hi, Timo
> >>> I'm sorry that I need to open another discussion thread befoe voting
> but I think we should also discuss this in this FLIP before it pops up at a
> later stage.
> >>>
> >>> How do we want our time functions to behave in long running queries?
> >> It’s okay to open this thread. Although I don’t want to consider the
> function value materialization in this FLIP scope,  I could try explain
> something.
> >>> See also:
> >>>
> https://stackoverflow.com/questions/5522656/sql-now-in-long-running-query
> >>>
> >>> I think this was never discussed thoroughly. Actually
> CURRENT_TIMESTAMP/NOW/LOCALTIMESTAMP should have slightly different
> semantics than PROCTIME(). What it is our current behavior? Are we
> materializing those time values during planning?
> >> Currently CURRENT_TIMESTAMP/NOW/LOCALTIMESTAMP  keeps same behavior in
> both Batch and Stream world,  the function value is materialized for per
> record not the query start(plan phase).
> >> For  PROCTIME(), it also keeps same behavior  in both Batch and Stream
> world, in fact we just supported PROCTIME() in Batch last week[1].
> >> In one word, we keep same semantics/behavior for Batch and Stream.
> >>> Esp. long running batch queries might suffer from inconsistencies
> here. When a timestamp is produced by one operator using CURRENT_TIMESTAMP
> and a different one might filter relating to CURRENT_TIMESTAMP.
> >> It’s a good question, and I've found some users have asked simillar
> questions in user/user-zh mail-list,  given a fact that many Batch systems
> like Hive/Presto using the value of query start, but it’s not suitable for
> Stream engine, for example user will use CURRENT_TIMESTAMP to define event
> time.
> >> As a unified Batch/Stream SQL engine, keep same semantics/behavior is
> important, and I agree the Batch user case should also be considered.
> >> But I think this should be discussed in another topic like 'the
> unification of Batch/Stream' which is beyond the scope of this FLIP.
> >> This FLIP aims to correct the wrong return type/return value of current
> time functions.
> >> Best,
> >> Leonard
> >> [1] https://issues.apache.org/jira/browse/FLINK-17868 <
> https://issues.apache.org/jira/browse/FLINK-17868> <
> https://issues.apache.org/jira/browse/FLINK-17868 <
> https://issues.apache.org/jira/browse/FLINK-17868>>
> >>> Regards,
> >>> Timo
> >>>
> >>>
> >>> On 28.01.21 13:46, Leonard Xu wrote:
> >>>> Hi, Jark
> >>>>> I have a minor suggestion:
> >>>>> I think we will still suggest users use TIMESTAMP even if we have
> TIMESTAMP_NTZ. Then it seems
> >>>>> introducing TIMESTAMP_NTZ doesn't help much for users, but
> introduces more learning costs.
> >>>> I think your suggestion makes sense, we should suggest users use
> TIMESTAMP for TIMESTAMP WITHOUT TIME ZONE as we did now, updated as
> following:
> >>>>    original type name :
>                       shortcut type name :
> >>>> TIMESTAMP / TIMESTAMP WITHOUT TIME ZONE         <=> TIMESTAMP
> >>>> TIMESTAMP WITH LOCAL TIME ZONE                            <=>
> TIMESTAMP_LTZ
> >>>> TIMESTAMP WITH TIME ZONE                                         <=>
> TIMESTAMP_TZ     (supports them in the future)
> >>>> Best,
> >>>> Leonard
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> On Thu, 28 Jan 2021 at 18:52, Leonard Xu <xbjt...@gmail.com <mailto:
> xbjt...@gmail.com> <mailto:xbjt...@gmail.com <mailto:xbjt...@gmail.com>>>
> wrote:
> >>>>>
> >>>>>> Thanks all for sharing your opinions.
> >>>>>>
> >>>>>> Looks like  we’ve reached a consensus about the topic.
> >>>>>>
> >>>>>> @Timo:
> >>>>>>> 1) Are we on the same page that LOCALTIMESTAMP returns TIMESTAMP
> and not
> >>>>>> TIMESTAMP_LTZ? Maybe we should quickly list also
> LOCALTIME/LOCALDATE and
> >>>>>> LOCALTIMESTAMP for completeness.
> >>>>>> Yes, LOCALTIMESTAMP returns TIMESTAMP, LOCALTIME returns TIME, the
> >>>>>> behavior of them is clear so I just listed them in the excel[1] of
> this
> >>>>>> FLIP references.
> >>>>>>
> >>>>>>> 2) Shall we add aliases for the timestamp types as part of this
> FLIP? I
> >>>>>> see Snowflake supports TIMESTAMP_LTZ , TIMESTAMP_NTZ , TIMESTAMP_TZ
> [1]. I
> >>>>>> think the discussion was quite cumbersome with the full string of
> >>>>>> `TIMESTAMP WITH LOCAL TIME ZONE`. With this FLIP we are making this
> type
> >>>>>> even more prominent. And important concepts should have a short name
> >>>>>> because they are used frequently. According to the FLIP, we are
> introducing
> >>>>>> the abbriviation already in function names like `TO_TIMESTAMP_LTZ`.
> >>>>>> `TIMESTAMP_LTZ` could be treated similar to `STRING` for
> >>>>>> `VARCHAR(MAX_INT)`, the serializable string representation would
> not change.
> >>>>>>
> >>>>>> @Timo @Jark
> >>>>>> Nice idea, I also suffered from the long name during the
> discussions, the
> >>>>>> abbreviation will not only help us, but also makes it more
> convenient for
> >>>>>> users. I list the abbreviation name mapping to support:
> >>>>>> TIMESTAMP WITHOUT TIME ZONE         <=> TIMESTAMP_NTZ   (which
> synonyms
> >>>>>> TIMESTAMP)
> >>>>>> TIMESTAMP WITH LOCAL TIME ZONE    <=> TIMESTAMP_LTZ
> >>>>>> TIMESTAMP WITH TIME ZONE                 <=> TIMESTAMP_TZ
>  (supports
> >>>>>> them in the future)
> >>>>>>> 3) I'm fine with supporting all conversion classes like
> >>>>>> java.time.LocalDateTime, java.sql.Timestamp that TimestampType
> supported
> >>>>>> for LocalZonedTimestampType. But we agree that Instant stays the
> default
> >>>>>> conversion class right? The default extraction defined in [2] will
> not
> >>>>>> change, correct?
> >>>>>> Yes, Instant stays the default conversion class. The default
> >>>>>>
> >>>>>>> 4) I would remove the comment "Flink supports TIME-related types
> with
> >>>>>> precision well", because unfortunately this is still not correct.
> We still
> >>>>>> have issues with TIME(9), it would be great if someone can finally
> fix that
> >>>>>> though. Maybe the implementation of this FLIP would be a good time
> to fix
> >>>>>> this issue.
> >>>>>> You’re right, TIME(9) is not supported yet, I'll take account of
> TIME(9)
> >>>>>> to the scope of this FLIP.
> >>>>>>
> >>>>>>
> >>>>>> I’ve updated this FLIP[2] according your suggestions @Jark @Timo
> >>>>>> I’ll start the vote soon if there’re no objections.
> >>>>>>
> >>>>>> Best,
> >>>>>> Leonard
> >>>>>>
> >>>>>> [1]
> >>>>>>
> https://docs.google.com/spreadsheets/d/1T178krh9xG-WbVpN7mRVJ8bzFnaSJx3l-eg1EWZe_X4/edit?usp=sharing
> >>>>>> <
> >>>>>>
> https://docs.google.com/spreadsheets/d/1T178krh9xG-WbVpN7mRVJ8bzFnaSJx3l-eg1EWZe_X4/edit?usp=sharing
> <
> https://docs.google.com/spreadsheets/d/1T178krh9xG-WbVpN7mRVJ8bzFnaSJx3l-eg1EWZe_X4/edit?usp=sharing
> >
> >>>>>>>
> >>>>>> [2]
> >>>>>>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-162%3A+Consistent+Flink+SQL+time+function+behavior
> <
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-162%3A+Consistent+Flink+SQL+time+function+behavior
> >
> >>>>>> <
> >>>>>>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-162:+Consistent+Flink+SQL+time+function+behavior
> <
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-162:+Consistent+Flink+SQL+time+function+behavior
> >>
> >>>>>>
> >>>>>>
> >>>>>>>
> >>>>>>> On 28.01.21 03:18, Jark Wu wrote:
> >>>>>>>> Thanks Leonard for the further investigation.
> >>>>>>>> I think we all agree we should correct the return value of
> >>>>>>>> CURRENT_TIMESTAMP.
> >>>>>>>> Regarding the return type of CURRENT_TIMESTAMP, I also agree
> >>>>>> TIMESTAMP_LTZ
> >>>>>>>> would be more worldwide useful. This may need more effort, but if
> this
> >>>>>> is
> >>>>>>>> the right direction, we should do it.
> >>>>>>>> Regarding the CURRENT_TIME, if CURRENT_TIMESTAMP returns
> >>>>>>>> TIMESTAMP_LTZ, then I think CURRENT_TIME shouldn't return TIME_TZ.
> >>>>>>>> Otherwise, CURRENT_TIME will be quite special and strange.
> >>>>>>>> Thus I think it has to return TIME type. Given that we already
> have
> >>>>>>>> CURRENT_DATE which returns
> >>>>>>>> DATE WITHOUT TIME ZONE, I think it's fine to return TIME WITHOUT
> TIME
> >>>>>> ZONE
> >>>>>>>> for CURRENT_TIME.
> >>>>>>>> In a word, the updated FLIP looks good to me. I especially like
> the
> >>>>>>>> proposed new function TO_TIMESTAMP_LTZ(numeric, [,scale]).
> >>>>>>>> This will be very convenient to define rowtime on a long value
> which is
> >>>>>> a
> >>>>>>>> very common case and has been complained a lot in mailing list.
> >>>>>>>> Best,
> >>>>>>>> Jark
> >>>>>>>> On Mon, 25 Jan 2021 at 21:12, Kurt Young <ykt...@gmail.com>
> wrote:
> >>>>>>>>> Thanks Leonard for the detailed response and also the bad case
> about
> >>>>>> option
> >>>>>>>>> 1, these all
> >>>>>>>>> make sense to me.
> >>>>>>>>>
> >>>>>>>>> Also nice catch about conversion support of
> LocalZonedTimestampType, I
> >>>>>>>>> think it actually
> >>>>>>>>> makes sense to support java.sql.Timestamp as well as
> >>>>>>>>> java.time.LocalDateTime. It also has
> >>>>>>>>> a slight benefit that we might have a chance to run the udf
> which took
> >>>>>> them
> >>>>>>>>> as input parameter
> >>>>>>>>> after we change the return type.
> >>>>>>>>>
> >>>>>>>>> Regarding to the return type of CURRENT_TIME, I also think
> timezone
> >>>>>>>>> information is not useful.
> >>>>>>>>> To not expand this FLIP further, I'm lean to keep it as it is.
> >>>>>>>>>
> >>>>>>>>> Best,
> >>>>>>>>> Kurt
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Mon, Jan 25, 2021 at 8:50 PM Leonard Xu <xbjt...@gmail.com>
> wrote:
> >>>>>>>>>
> >>>>>>>>>> Hi, All
> >>>>>>>>>>
> >>>>>>>>>> Thanks for your comments. I think all of the thread have agreed
> that:
> >>>>>>>>>> (1) The return values of
> >>>>>> CURRENT_TIME/CURRENT_TIMESTAMP/NOW()/PROCTIME()
> >>>>>>>>>> are wrong.
> >>>>>>>>>> (2) The LOCALTIME/LOCALTIMESTAMP and
> CURRENT_TIME/CURRENT_TIMESTAMP
> >>>>>>>>> should
> >>>>>>>>>> be different whether from SQL standard’s perspective or mature
> >>>>>> systems.
> >>>>>>>>>> (3) The semantics of three TIMESTAMP types in Flink SQL follows
> the
> >>>>>> SQL
> >>>>>>>>>> standard and also keeps the same with other 'good' vendors.
> >>>>>>>>>>    TIMESTAMP                                   =>  A literal in
> >>>>>>>>>> ‘yyyy-MM-dd HH:mm:ss’ format to describe a time, does not
> contain
> >>>>>>>>> timezone
> >>>>>>>>>> info, can not represent an absolute time point.
> >>>>>>>>>>    TIMESTAMP WITH LOCAL ZONE =>  Records the elapsed time from
> >>>>>> absolute
> >>>>>>>>>> time point origin, can represent an absolute time point,
> requires
> >>>>>> local
> >>>>>>>>>> time zone when expressed with ‘yyyy-MM-dd HH:mm:ss’ format.
> >>>>>>>>>>    TIMESTAMP WITH TIME ZONE    =>  Consists of time zone info
> and a
> >>>>>>>>>> literal in ‘yyyy-MM-dd HH:mm:ss’ format to describe time, can
> >>>>>> represent
> >>>>>>>>> an
> >>>>>>>>>> absolute time point.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Currently we've two ways to correct
> >>>>>>>>>> CURRENT_TIME/CURRENT_TIMESTAMP/NOW()/PROCTIME().
> >>>>>>>>>>
> >>>>>>>>>> option (1): As the FLIP proposed, change the return value  from
> UTC
> >>>>>>>>>> timezone to local timezone.
> >>>>>>>>>>        Pros:   (1) The change looks smaller to users and
> developers
> >>>>>> (2)
> >>>>>>>>>> There're many SQL engines adopted this way
> >>>>>>>>>>        Cons:  (1) connector devs may confuse the underlying
> value of
> >>>>>>>>>> TimestampData which needs to change according to data type  (2)
> I
> >>>>>> thought
> >>>>>>>>>> about this weekend. Unfortunately I found a bad case:
> >>>>>>>>>>
> >>>>>>>>>> The proposal is fine if we only use it in FLINK SQL world, but
> we
> >>>>>> need to
> >>>>>>>>>> consider the conversion between Table/DataStream, assume a
> record
> >>>>>>>>> produced
> >>>>>>>>>> in UTC+0 timezone with TIMESTAMP '1970-01-01 08:00:44'  and the
> Flink
> >>>>>> SQL
> >>>>>>>>>> processes the data with session time zone 'UTC+8', if the sql
> program
> >>>>>>>>> need
> >>>>>>>>>> to convert the Table to DataStream, then we need to calculate
> the
> >>>>>>>>> timestamp
> >>>>>>>>>> in StreamRecord with session time zone (UTC+8), then we will
> get 44 in
> >>>>>>>>>> DataStream program, but it is wrong because the expected value
> should
> >>>>>> be
> >>>>>>>>> (8
> >>>>>>>>>> * 60 * 60 + 44). The corner case tell us that the
> ROWTIME/PROCTIME in
> >>>>>>>>> Flink
> >>>>>>>>>> are based on UTC+0, when correct the PROCTIME() function, the
> better
> >>>>>> way
> >>>>>>>>> is
> >>>>>>>>>> to use TIMESTAMP WITH LOCAL TIME ZONE which keeps same long
> value with
> >>>>>>>>> time
> >>>>>>>>>> based on UTC+0 and can be expressed with  local timezone.
> >>>>>>>>>>
> >>>>>>>>>> option (2) : As we considered in the FLIP as well as @Timo
> suggested,
> >>>>>>>>>> change the return type to TIMESTAMP WITH LOCAL TIME ZONE, the
> >>>>>> expressed
> >>>>>>>>>> value depends on the local time zone.
> >>>>>>>>>>        Pros: (1) Make Flink SQL more close to SQL standard  (2)
> Can
> >>>>>> deal
> >>>>>>>>>> the conversion between Table/DataStream well
> >>>>>>>>>>        Cons: (1) We need to discuss the return value/type of
> >>>>>>>>> CURRENT_TIME
> >>>>>>>>>> function (2) The change is bigger to users, we need to support
> >>>>>> TIMESTAMP
> >>>>>>>>>> WITH LOCAL TIME ZONE in connectors/formats as well as custom
> >>>>>> connectors.
> >>>>>>>>>>                   (3)The TIMESTAMP WITH LOCAL TIME ZONE support
> is
> >>>>>> weak
> >>>>>>>>>> in Flink, thus we need some improvement,but the workload does
> not
> >>>>>> matter
> >>>>>>>>>> as long as we are doing the right thing ^_^
> >>>>>>>>>>
> >>>>>>>>>> Due to the above bad case for option (1). I think option 2
> should be
> >>>>>>>>>> adopted,
> >>>>>>>>>> But we also need to consider some problems:
> >>>>>>>>>> (1) More conversion classes like LocalDateTime, sql.Timestamp
> should
> >>>>>> be
> >>>>>>>>>> supported for LocalZonedTimestampType to resolve the UDF
> compatibility
> >>>>>>>>> issue
> >>>>>>>>>> (2) The timezone offset for window size of one day should still
> be
> >>>>>>>>>> considered
> >>>>>>>>>> (3) All connectors/formats should supports TIMESTAMP WITH LOCAL
> TIME
> >>>>>> ZONE
> >>>>>>>>>> well and we also should record in document
> >>>>>>>>>> I’ll update these sections of FLIP-162.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> We also need to discuss the CURRENT_TIME function. I know the
> standard
> >>>>>>>>> way
> >>>>>>>>>> is using TIME WITH TIME ZONE(there's no TIME WITH LOCAL TIME
> ZONE),
> >>>>>> but
> >>>>>>>>> we
> >>>>>>>>>> don't support this type yet and I don't see strong motivation to
> >>>>>> support
> >>>>>>>>> it
> >>>>>>>>>> so far.
> >>>>>>>>>> Compared to CURRENT_TIMESTAMP, the CURRENT_TIME can not
> represent an
> >>>>>>>>>> absolute time point which should be considered as a string
> consisting
> >>>>>> of
> >>>>>>>>> a
> >>>>>>>>>> time with 'HH:mm:ss' format and time zone info.  We have several
> >>>>>> options
> >>>>>>>>>> for this:
> >>>>>>>>>> (1) We can forbid CURRENT_TIME as @Timo proposed to make all
> Flink SQL
> >>>>>>>>>> functions follow the standard well,  in this way, we need to
> offer
> >>>>>> some
> >>>>>>>>>> guidance for user upgrading Flink versions.
> >>>>>>>>>> (2) We can also support it from a user's perspective who has
> used
> >>>>>>>>>> CURRENT_DATE/CURRENT_TIME/CURRENT_TIMESTAMP, btw,Snowflake also
> >>>>>> returns
> >>>>>>>>>> TIME type.
> >>>>>>>>>> (3) Returns TIMESTAMP WITH LOCAL TIME ZONE to make it equal to
> >>>>>>>>>> CURRENT_TIMESTAMP as Calcite did.
> >>>>>>>>>>
> >>>>>>>>>> I can image (1) which we don't want to left a bad smell in
> Flink SQL,
> >>>>>>>>> and
> >>>>>>>>>> I also accept (2) because I think users do not consider time
> zone
> >>>>>> issues
> >>>>>>>>>> when they use CURRENT_DATE/CURRENT_TIME, and the timezone info
> in
> >>>>>> time is
> >>>>>>>>>> not very useful.
> >>>>>>>>>>
> >>>>>>>>>> I don’t have a strong opinion  for them.  What do others think?
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> I hope I've addressed your concerns. @Timo @Kurt
> >>>>>>>>>>
> >>>>>>>>>> Best,
> >>>>>>>>>> Leonard
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>> Most of the mature systems have a clear difference between
> >>>>>>>>>> CURRENT_TIMESTAMP and LOCALTIMESTAMP. I wouldn't take Spark or
> Hive
> >>>>>> as a
> >>>>>>>>>> good example. Snowflake decided for TIMESTAMP WITH LOCAL TIME
> ZONE.
> >>>>>> As I
> >>>>>>>>>> mentioned in the last comment, I could also imagine this
> behavior for
> >>>>>>>>>> Flink. But in any case, there should be some time zone
> information
> >>>>>>>>>> considered in order to cast to all other types.
> >>>>>>>>>>>
> >>>>>>>>>>>>>> The function CURRENT_DATE/CURRENT_TIME is supporting in SQL
> >>>>>>>>>> standard, but
> >>>>>>>>>>>>>> LOCALDATE not, I don’t think it’s a good idea that dropping
> >>>>>>>>>> functions which
> >>>>>>>>>>>>>> SQL standard supported and introducing a replacement which
> SQL
> >>>>>>>>>> standard not
> >>>>>>>>>>>>>> reminded.
> >>>>>>>>>>>
> >>>>>>>>>>> We can still add those functions in the future. But since we
> don't
> >>>>>>>>> offer
> >>>>>>>>>> a TIME WITH TIME ZONE, it is better to not support this
> function at
> >>>>>> all
> >>>>>>>>> for
> >>>>>>>>>> now. And by the way, this is exactly the behavior that also
> Microsoft
> >>>>>> SQL
> >>>>>>>>>> Server does: it also just supports CURRENT_TIMESTAMP (but it
> returns
> >>>>>>>>>> TIMESTAMP without a zone which completes the confusion).
> >>>>>>>>>>>
> >>>>>>>>>>>>>> I also agree returning  TIMESTAMP WITH LOCAL TIME ZONE for
> >>>>>> PROCTIME
> >>>>>>>>>> has
> >>>>>>>>>>>>>> more clear semantics, but I realized that user didn’t care
> the
> >>>>>> type
> >>>>>>>>>> but
> >>>>>>>>>>>>>> more about the expressed value they saw, and change the
> type from
> >>>>>>>>>> TIMESTAMP
> >>>>>>>>>>>>>> to TIMESTAMP WITH LOCAL TIME ZONE brings huge refactor that
> we
> >>>>>> need
> >>>>>>>>>>>>>> consider all places where the TIMESTAMP type used
> >>>>>>>>>>>
> >>>>>>>>>>> From a UDF perspective, I think nothing will change. The new
> type
> >>>>>>>>> system
> >>>>>>>>>> and type inference were designed to support all these cases.
> There is
> >>>>>> a
> >>>>>>>>>> reason why Java has adopted Joda time, because it is hard to
> come up
> >>>>>>>>> with a
> >>>>>>>>>> good time library. That's why also we and the other Hadoop
> ecosystem
> >>>>>>>>> folks
> >>>>>>>>>> have decided for 3 different kinds of LocalDateTime,
> ZonedDateTime,
> >>>>>> and
> >>>>>>>>>> Instance. It makes the library more complex, but time is a
> complex
> >>>>>> topic.
> >>>>>>>>>>>
> >>>>>>>>>>> I also doubt that many users work with only one time zone.
> Take the
> >>>>>> US
> >>>>>>>>>> as an example, a country with 3 different timezones. Somebody
> working
> >>>>>>>>> with
> >>>>>>>>>> US data cannot properly see the data points with just LOCAL
> TIME ZONE.
> >>>>>>>>> But
> >>>>>>>>>> on the other hand, a lot of event data is stored using a UTC
> >>>>>> timestamp.
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>>> Before jumping into technique details, let's take a step
> back to
> >>>>>>>>>> discuss
> >>>>>>>>>>>>> user experience.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> The first important question is what kind of date and time
> will
> >>>>>>>>> Flink
> >>>>>>>>>>>>> display when users call
> >>>>>>>>>>>>>  CURRENT_TIMESTAMP and maybe also PROCTIME (if we think they
> are
> >>>>>>>>>> similar).
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Should it always display the date and time in UTC or in the
> user's
> >>>>>>>>>> time
> >>>>>>>>>>>>> zone?
> >>>>>>>>>>>
> >>>>>>>>>>> @Kurt: I think we all agree that the current behavior with just
> >>>>>> showing
> >>>>>>>>>> UTC is wrong. Also, we all agree that when calling
> CURRENT_TIMESTAMP
> >>>>>> or
> >>>>>>>>>> PROCTIME a user would like to see the time in it's current time
> zone.
> >>>>>>>>>>>
> >>>>>>>>>>> As you said, "my wall clock time".
> >>>>>>>>>>>
> >>>>>>>>>>> However, the question is what is the data type of what you
> "see". If
> >>>>>>>>> you
> >>>>>>>>>> pass this record on to a different system, operator, or
> different
> >>>>>>>>> cluster,
> >>>>>>>>>> should the "my" get lost or materialized into the record?
> >>>>>>>>>>>
> >>>>>>>>>>> TIMESTAMP -> completely lost and could cause confusion in a
> different
> >>>>>>>>>> system
> >>>>>>>>>>>
> >>>>>>>>>>> TIMESTAMP WITH LOCAL TIME ZONE -> at least the UTC is correct,
> so you
> >>>>>>>>>> can provide a new local time zone
> >>>>>>>>>>>
> >>>>>>>>>>> TIMESTAMP WITH TIME ZONE -> also "your" location is persisted
> >>>>>>>>>>>
> >>>>>>>>>>> Regards,
> >>>>>>>>>>> Timo
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On 22.01.21 09:38, Kurt Young wrote:
> >>>>>>>>>>>> Forgot one more thing. Continue with displaying in UTC. As a
> user,
> >>>>>> if
> >>>>>>>>>> Flink
> >>>>>>>>>>>> want to display the timestamp
> >>>>>>>>>>>> in UTC, why don't we offer something like UTC_TIMESTAMP?
> >>>>>>>>>>>> Best,
> >>>>>>>>>>>> Kurt
> >>>>>>>>>>>> On Fri, Jan 22, 2021 at 4:33 PM Kurt Young <ykt...@gmail.com>
> >>>>>> wrote:
> >>>>>>>>>>>>> Before jumping into technique details, let's take a step
> back to
> >>>>>>>>>> discuss
> >>>>>>>>>>>>> user experience.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> The first important question is what kind of date and time
> will
> >>>>>> Flink
> >>>>>>>>>>>>> display when users call
> >>>>>>>>>>>>> CURRENT_TIMESTAMP and maybe also PROCTIME (if we think they
> are
> >>>>>>>>>> similar).
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Should it always display the date and time in UTC or in the
> user's
> >>>>>>>>> time
> >>>>>>>>>>>>> zone? I think this part is the
> >>>>>>>>>>>>> reason that surprised lots of users. If we forget about the
> type
> >>>>>> and
> >>>>>>>>>>>>> internal representation of these
> >>>>>>>>>>>>> two methods, as a user, my instinct tells me that these two
> methods
> >>>>>>>>>> should
> >>>>>>>>>>>>> display my wall clock time.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Display time in UTC? I'm not sure, why I should care about
> UTC
> >>>>>> time?
> >>>>>>>>> I
> >>>>>>>>>>>>> want to get my current timestamp.
> >>>>>>>>>>>>> For those users who have never gone abroad, they might not
> even be
> >>>>>>>>>> able to
> >>>>>>>>>>>>> realize that this is affected
> >>>>>>>>>>>>> by the time zone.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Best,
> >>>>>>>>>>>>> Kurt
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Fri, Jan 22, 2021 at 12:25 PM Leonard Xu <
> xbjt...@gmail.com>
> >>>>>>>>> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Thanks @Timo for the detailed reply, let's go on this topic
> on
> >>>>>> this
> >>>>>>>>>>>>>> discussion,  I've merged all mails to this discussion.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> LOCALDATE / LOCALTIME / LOCALTIMESTAMP
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> --> uses session time zone, returns DATE/TIME/TIMESTAMP
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> CURRENT_DATE/CURRENT_TIME/CURRENT_TIMESTAMP
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> --> uses session time zone, returns DATE/TIME/TIMESTAMP
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I'm very sceptical about this behavior. Almost all mature
> systems
> >>>>>>>>>>>>>> (Oracle, Postgres) and new high quality systems (Presto,
> >>>>>> Snowflake)
> >>>>>>>>>> use a
> >>>>>>>>>>>>>> data type with some degree of time zone information
> encoded. In a
> >>>>>>>>>>>>>> globalized world with businesses spanning different
> regions, I
> >>>>>> think
> >>>>>>>>>> we
> >>>>>>>>>>>>>> should do this as well. There should be a difference between
> >>>>>>>>>>>>>> CURRENT_TIMESTAMP and LOCALTIMESTAMP. And users should be
> able to
> >>>>>>>>>> choose
> >>>>>>>>>>>>>> which behavior they prefer for their pipeline.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I know that the two series should be different at first
> glance,
> >>>>>> but
> >>>>>>>>>>>>>> different SQL engines can have their own explanations,for
> example,
> >>>>>>>>>>>>>> CURRENT_TIMESTAMP and LOCALTIMESTAMP are synonyms in
> Snowflake[1]
> >>>>>>>>> and
> >>>>>>>>>> has
> >>>>>>>>>>>>>> no difference, and Spark only supports the later one and
> doesn’t
> >>>>>>>>>> support
> >>>>>>>>>>>>>> LOCALTIME/LOCALTIMESTAMP[2].
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> If we would design this from scatch, I would suggest the
> >>>>>> following:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> - drop CURRENT_DATE / CURRENT_TIME and let users pick
> LOCALDATE /
> >>>>>>>>>>>>>> LOCALTIME for materialized timestamp parts
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> The function CURRENT_DATE/CURRENT_TIME is supporting in SQL
> >>>>>>>>> standard,
> >>>>>>>>>> but
> >>>>>>>>>>>>>> LOCALDATE not, I don’t think it’s a good idea that dropping
> >>>>>>>>> functions
> >>>>>>>>>> which
> >>>>>>>>>>>>>> SQL standard supported and introducing a replacement which
> SQL
> >>>>>>>>>> standard not
> >>>>>>>>>>>>>> reminded.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> - CURRENT_TIMESTAMP should return a TIMESTAMP WITH TIME
> ZONE to
> >>>>>>>>>>>>>> materialize all session time information into every record.
> It it
> >>>>>>>>> the
> >>>>>>>>>> most
> >>>>>>>>>>>>>> generic data type and allows to cast to all other timestamp
> data
> >>>>>>>>>> types.
> >>>>>>>>>>>>>> This generic ability can be used for filter predicates as
> well
> >>>>>>>>> either
> >>>>>>>>>>>>>> through implicit or explicit casting.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> TIMESTAMP WITH TIME ZONE indeed contains more information to
> >>>>>>>>> describe
> >>>>>>>>>> a
> >>>>>>>>>>>>>> time point, but the type TIMESTAMP  can cast to all other
> >>>>>> timestamp
> >>>>>>>>>> data
> >>>>>>>>>>>>>> types combining with session time zone as well, and it also
> can be
> >>>>>>>>>> used for
> >>>>>>>>>>>>>> filter predicates. For type casting between BIGINT and
> TIMESTAMP,
> >>>>>> I
> >>>>>>>>>> think
> >>>>>>>>>>>>>> the function way using TO_TIMEMTAMP()/FROM_UNIXTIMESTAMP()
> is more
> >>>>>>>>>> clear.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> PROCTIME/ROWTIME should be time functions based on a long
> value.
> >>>>>>>>> Both
> >>>>>>>>>>>>>> System.currentMillis() and our watermark system work on long
> >>>>>> values.
> >>>>>>>>>> Those
> >>>>>>>>>>>>>> should return TIMESTAMP WITH LOCAL TIME ZONE because the
> main
> >>>>>>>>>> calculation
> >>>>>>>>>>>>>> should always happen based on UTC.
> >>>>>>>>>>>>>>> We discussed it in a different thread, but we should allow
> >>>>>> PROCTIME
> >>>>>>>>>>>>>> globally. People need a way to create instances of
> TIMESTAMP WITH
> >>>>>>>>>> LOCAL
> >>>>>>>>>>>>>> TIME ZONE. This is not considered in the current design doc.
> >>>>>>>>>>>>>>> Many pipelines contain UTC timestamps and thus it should
> be easy
> >>>>>> to
> >>>>>>>>>>>>>> create one.
> >>>>>>>>>>>>>>> Also, both CURRENT_TIMESTAMP and LOCALTIMESTAMP can work
> with
> >>>>>> this
> >>>>>>>>>> type
> >>>>>>>>>>>>>> because we should remember that TIMESTAMP WITH LOCAL TIME
> ZONE
> >>>>>>>>>> accepts all
> >>>>>>>>>>>>>> timestamp data types as casting target [1]. We could allow
> >>>>>> TIMESTAMP
> >>>>>>>>>> WITH
> >>>>>>>>>>>>>> TIME ZONE in the future for ROWTIME.
> >>>>>>>>>>>>>>> In any case, windows should simply adapt their behavior to
> the
> >>>>>>>>> passed
> >>>>>>>>>>>>>> timestamp type. And with TIMESTAMP WITH LOCAL TIME ZONE a
> day is
> >>>>>>>>>> defined by
> >>>>>>>>>>>>>> considering the current session time zone.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I also agree returning  TIMESTAMP WITH LOCAL TIME ZONE for
> >>>>>> PROCTIME
> >>>>>>>>>> has
> >>>>>>>>>>>>>> more clear semantics, but I realized that user didn’t care
> the
> >>>>>> type
> >>>>>>>>>> but
> >>>>>>>>>>>>>> more about the expressed value they saw, and change the
> type from
> >>>>>>>>>> TIMESTAMP
> >>>>>>>>>>>>>> to TIMESTAMP WITH LOCAL TIME ZONE brings huge refactor that
> we
> >>>>>> need
> >>>>>>>>>>>>>> consider all places where the TIMESTAMP type used, and many
> >>>>>> builtin
> >>>>>>>>>>>>>> functions and UDFs doest not support  TIMESTAMP WITH LOCAL
> TIME
> >>>>>> ZONE
> >>>>>>>>>> type.
> >>>>>>>>>>>>>> That means both user and Flink devs need to refactor the
> code(UDF,
> >>>>>>>>>> builtin
> >>>>>>>>>>>>>> functions, sql pipeline), to be honest, I didn’t see strong
> >>>>>>>>>> motivation that
> >>>>>>>>>>>>>> we have to do the pretty big refactor from user’s
> perspective and
> >>>>>>>>>>>>>> developer’s perspective.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> In one word, both your suggestion and my proposal can
> resolve
> >>>>>> almost
> >>>>>>>>>> all
> >>>>>>>>>>>>>> user problems,the divergence is whether we need to spend
> pretty
> >>>>>>>>>> energy just
> >>>>>>>>>>>>>> to get a bit more accurate semantics?   I think we need a
> >>>>>> tradeoff.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>> Leonard
> >>>>>>>>>>>>>> [1]
> >>>>>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>
> https://trino.io/docs/current/functions/datetime.html#current_timestamp
> >>>>>>>>> <
> >>>>>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>
> https://trino.io/docs/current/functions/datetime.html#current_timestamp>
> >>>>>>>>>>>>>> [2] https://issues.apache.org/jira/browse/SPARK-30374 <
> >>>>>>>>>>>>>> https://issues.apache.org/jira/browse/SPARK-30374>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> 2021-01-22,00:53,Timo Walther <twal...@apache.org> :
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Hi Leonard,
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> thanks for working on this topic. I agree that time
> handling is
> >>>>>> not
> >>>>>>>>>>>>>> easy in Flink at the moment. We added new time data types
> (and
> >>>>>> some
> >>>>>>>>>> are
> >>>>>>>>>>>>>> still not supported which even further complicates things
> like
> >>>>>>>>>> TIME(9)). We
> >>>>>>>>>>>>>> should definitely improve this situation for users.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> This is a pretty opinionated topic and it seems that the
> SQL
> >>>>>>>>> standard
> >>>>>>>>>>>>>> is not really deciding this but is at least supporting. So
> let me
> >>>>>>>>>> express
> >>>>>>>>>>>>>> my opinion for the most important functions:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> LOCALDATE / LOCALTIME / LOCALTIMESTAMP
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> --> uses session time zone, returns DATE/TIME/TIMESTAMP
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I think those are the most obvious ones because the LOCAL
> >>>>>> indicates
> >>>>>>>>>>>>>> that the locality should be materialized into the result
> and any
> >>>>>>>>> time
> >>>>>>>>>> zone
> >>>>>>>>>>>>>> information (coming from session config or data) is not
> important
> >>>>>>>>>>>>>> afterwards.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> CURRENT_DATE/CURRENT_TIME/CURRENT_TIMESTAMP
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> --> uses session time zone, returns DATE/TIME/TIMESTAMP
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I'm very sceptical about this behavior. Almost all mature
> systems
> >>>>>>>>>>>>>> (Oracle, Postgres) and new high quality systems (Presto,
> >>>>>> Snowflake)
> >>>>>>>>>> use a
> >>>>>>>>>>>>>> data type with some degree of time zone information
> encoded. In a
> >>>>>>>>>>>>>> globalized world with businesses spanning different
> regions, I
> >>>>>> think
> >>>>>>>>>> we
> >>>>>>>>>>>>>> should do this as well. There should be a difference between
> >>>>>>>>>>>>>> CURRENT_TIMESTAMP and LOCALTIMESTAMP. And users should be
> able to
> >>>>>>>>>> choose
> >>>>>>>>>>>>>> which behavior they prefer for their pipeline.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> If we would design this from scatch, I would suggest the
> >>>>>> following:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> - drop CURRENT_DATE / CURRENT_TIME and let users pick
> LOCALDATE /
> >>>>>>>>>>>>>> LOCALTIME for materialized timestamp parts
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> - CURRENT_TIMESTAMP should return a TIMESTAMP WITH TIME
> ZONE to
> >>>>>>>>>>>>>> materialize all session time information into every record.
> It it
> >>>>>>>>> the
> >>>>>>>>>> most
> >>>>>>>>>>>>>> generic data type and allows to cast to all other timestamp
> data
> >>>>>>>>>> types.
> >>>>>>>>>>>>>> This generic ability can be used for filter predicates as
> well
> >>>>>>>>> either
> >>>>>>>>>>>>>> through implicit or explicit casting.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> PROCTIME/ROWTIME should be time functions based on a long
> value.
> >>>>>>>>> Both
> >>>>>>>>>>>>>> System.currentMillis() and our watermark system work on long
> >>>>>> values.
> >>>>>>>>>> Those
> >>>>>>>>>>>>>> should return TIMESTAMP WITH LOCAL TIME ZONE because the
> main
> >>>>>>>>>> calculation
> >>>>>>>>>>>>>> should always happen based on UTC. We discussed it in a
> different
> >>>>>>>>>> thread,
> >>>>>>>>>>>>>> but we should allow PROCTIME globally. People need a way to
> create
> >>>>>>>>>>>>>> instances of TIMESTAMP WITH LOCAL TIME ZONE. This is not
> >>>>>> considered
> >>>>>>>>>> in the
> >>>>>>>>>>>>>> current design doc. Many pipelines contain UTC timestamps
> and thus
> >>>>>>>>> it
> >>>>>>>>>>>>>> should be easy to create one. Also, both CURRENT_TIMESTAMP
> and
> >>>>>>>>>>>>>> LOCALTIMESTAMP can work with this type because we should
> remember
> >>>>>>>>> that
> >>>>>>>>>>>>>> TIMESTAMP WITH LOCAL TIME ZONE accepts all timestamp data
> types as
> >>>>>>>>>> casting
> >>>>>>>>>>>>>> target [1]. We could allow TIMESTAMP WITH TIME ZONE in the
> future
> >>>>>>>>> for
> >>>>>>>>>>>>>> ROWTIME.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> In any case, windows should simply adapt their behavior to
> the
> >>>>>>>>> passed
> >>>>>>>>>>>>>> timestamp type. And with TIMESTAMP WITH LOCAL TIME ZONE a
> day is
> >>>>>>>>>> defined by
> >>>>>>>>>>>>>> considering the current session time zone.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> If we would like to design this with less effort required,
> we
> >>>>>> could
> >>>>>>>>>>>>>> think about returning TIMESTAMP WITH LOCAL TIME ZONE also
> for
> >>>>>>>>>>>>>> CURRENT_TIMESTAMP.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I will try to involve more people into this discussion.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>>> Timo
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> [1]
> >>>>>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> https://docs.oracle.com/en/database/oracle/oracle-database/21/sqlrf/Data-Types.html#GUID-E7CA339A-2093-4FE4-A36E-1D09593591D3
> >>>>>>>>>>>>>> <
> >>>>>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> https://docs.oracle.com/en/database/oracle/oracle-database/21/sqlrf/Data-Types.html#GUID-E7CA339A-2093-4FE4-A36E-1D09593591D3
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> 2021-01-21,22:32,Leonard Xu <xbjt...@gmail.com> :
> >>>>>>>>>>>>>>>> Before the changes, as I am writing this reply, the local
> time
> >>>>>>>>> here
> >>>>>>>>>> is
> >>>>>>>>>>>>>> 2021-01-21 12:03:35 (Beijing time, UTC+8).
> >>>>>>>>>>>>>>>> And I tried these 5 functions in sql client, and got:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Flink SQL> select now(), PROCTIME(), CURRENT_TIMESTAMP,
> >>>>>>>>>> CURRENT_DATE,
> >>>>>>>>>>>>>> CURRENT_TIME;
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> >>>>>>>>>>>>>>>> |                  EXPR$0 |                  EXPR$1 |
> >>>>>>>>>>>>>> CURRENT_TIMESTAMP | CURRENT_DATE | CURRENT_TIME |
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> >>>>>>>>>>>>>>>> | 2021-01-21T04:03:35.228 | 2021-01-21T04:03:35.228 |
> >>>>>>>>>>>>>> 2021-01-21T04:03:35.228 |   2021-01-21 | 04:03:35.228 |
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> >>>>>>>>>>>>>>>> After the changes, the expected behavior will change to:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Flink SQL> select now(), PROCTIME(), CURRENT_TIMESTAMP,
> >>>>>>>>>> CURRENT_DATE,
> >>>>>>>>>>>>>> CURRENT_TIME;
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> >>>>>>>>>>>>>>>> |                  EXPR$0 |                  EXPR$1 |
> >>>>>>>>>>>>>> CURRENT_TIMESTAMP | CURRENT_DATE | CURRENT_TIME |
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> >>>>>>>>>>>>>>>> | 2021-01-21T12:03:35.228 | 2021-01-21T12:03:35.228 |
> >>>>>>>>>>>>>> 2021-01-21T12:03:35.228 |   2021-01-21 | 12:03:35.228 |
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> >>>>>>>>>>>>>>>> The return type of now(), proctime() and
> CURRENT_TIMESTAMP still
> >>>>>>>>> be
> >>>>>>>>>>>>>> TIMESTAMP;
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> To Kurt, thanks  for the intuitive case, it really clear,
> you’re
> >>>>>>>>>> wright
> >>>>>>>>>>>>>> that I want to propose to change the return value of these
> >>>>>>>>> functions.
> >>>>>>>>>> It’s
> >>>>>>>>>>>>>> the most important part of the topic from user's
> perspective.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> I think this definitely deserves a FLIP.
> >>>>>>>>>>>>>>> To Jark,  nice suggestion, I prepared a FLIP for this
> topic, and
> >>>>>>>>> will
> >>>>>>>>>>>>>> start the FLIP discussion soon.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> If use the default Flink SQL,&nbsp; the window time
> range of
> >>>>>> the
> >>>>>>>>>>>>>>>>> statistics is incorrect, then the statistical results
> will
> >>>>>>>>>> naturally
> >>>>>>>>>>>>>> be
> >>>>>>>>>>>>>>>>> incorrect.
> >>>>>>>>>>>>>>> To zhisheng, sorry to hear that this problem influenced
> your
> >>>>>>>>>> production
> >>>>>>>>>>>>>> jobs,  Could you share your SQL pattern?  we can have more
> inputs
> >>>>>>>>> and
> >>>>>>>>>> try
> >>>>>>>>>>>>>> to resolve them.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>> Leonard
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> 2021-01-21,14:19,Jark Wu <imj...@gmail.com> :
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Great examples to understand the problem and the proposed
> >>>>>> changes,
> >>>>>>>>>>>>>> @Kurt!
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Thanks Leonard for investigating this problem.
> >>>>>>>>>>>>>>> The time-zone problems around time functions and windows
> have
> >>>>>>>>>> bothered a
> >>>>>>>>>>>>>>> lot of users. It's time to fix them!
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> The return value changes sound reasonable to me, and
> keeping the
> >>>>>>>>>> return
> >>>>>>>>>>>>>>> type unchanged will minimize the surprise to the users.
> >>>>>>>>>>>>>>> Besides that, I think it would be better to mention how
> this
> >>>>>>>>> affects
> >>>>>>>>>> the
> >>>>>>>>>>>>>>> window behaviors, and the interoperability with DataStream.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I think this definitely deserves a FLIP.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> ====================================================
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Hi zhisheng,
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Do you have examples to illustrate which case will get the
> wrong
> >>>>>>>>>> window
> >>>>>>>>>>>>>>> boundaries?
> >>>>>>>>>>>>>>> That will help to verify whether the proposed changes can
> solve
> >>>>>>>>> your
> >>>>>>>>>>>>>>> problem.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>> Jark
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> 2021-01-21,12:54,zhisheng <173855...@qq.com> :
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Thanks to Leonard Xu for discussing this tricky topic. At
> >>>>>> present,
> >>>>>>>>>>>>>> there are many Flink jobs in our production environment
> that are
> >>>>>>>>> used
> >>>>>>>>>> to
> >>>>>>>>>>>>>> count day-level reports (eg: count PV/UV ).&nbsp;
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> If use the default Flink SQL,&nbsp; the window time range
> of the
> >>>>>>>>>>>>>> statistics is incorrect, then the statistical results will
> >>>>>> naturally
> >>>>>>>>>> be
> >>>>>>>>>>>>>> incorrect.&nbsp;
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> The user needs to deal with the time zone manually in
> order to
> >>>>>>>>> solve
> >>>>>>>>>>>>>> the problem.&nbsp;
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> If Flink itself can solve these time zone issues, then I
> think it
> >>>>>>>>>> will
> >>>>>>>>>>>>>> be user-friendly.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Thank you
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Best!;
> >>>>>>>>>>>>>>> zhisheng
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> 2021-01-21,12:11,Kurt Young <ykt...@gmail.com> :
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> cc this to user & user-zh mailing list because this will
> affect
> >>>>>>>>> lots
> >>>>>>>>>> of
> >>>>>>>>>>>>>> users, and also quite a lot of users
> >>>>>>>>>>>>>>> were asking questions around this topic.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Let me try to understand this from user's perspective.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Your proposal will affect five functions, which are:
> >>>>>>>>>>>>>>> PROCTIME()
> >>>>>>>>>>>>>>> NOW()
> >>>>>>>>>>>>>>> CURRENT_DATE
> >>>>>>>>>>>>>>> CURRENT_TIME
> >>>>>>>>>>>>>>> CURRENT_TIMESTAMP
> >>>>>>>>>>>>>>> Before the changes, as I am writing this reply, the local
> time
> >>>>>> here
> >>>>>>>>>> is
> >>>>>>>>>>>>>> 2021-01-21 12:03:35 (Beijing time, UTC+8).
> >>>>>>>>>>>>>>> And I tried these 5 functions in sql client, and got:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Flink SQL> select now(), PROCTIME(), CURRENT_TIMESTAMP,
> >>>>>>>>> CURRENT_DATE,
> >>>>>>>>>>>>>> CURRENT_TIME;
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> >>>>>>>>>>>>>>> |                  EXPR$0 |                  EXPR$1 |
> >>>>>>>>>>>>>> CURRENT_TIMESTAMP | CURRENT_DATE | CURRENT_TIME |
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> >>>>>>>>>>>>>>> | 2021-01-21T04:03:35.228 | 2021-01-21T04:03:35.228 |
> >>>>>>>>>>>>>> 2021-01-21T04:03:35.228 |   2021-01-21 | 04:03:35.228 |
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> >>>>>>>>>>>>>>> After the changes, the expected behavior will change to:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Flink SQL> select now(), PROCTIME(), CURRENT_TIMESTAMP,
> >>>>>>>>> CURRENT_DATE,
> >>>>>>>>>>>>>> CURRENT_TIME;
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> >>>>>>>>>>>>>>> |                  EXPR$0 |                  EXPR$1 |
> >>>>>>>>>>>>>> CURRENT_TIMESTAMP | CURRENT_DATE | CURRENT_TIME |
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> >>>>>>>>>>>>>>> | 2021-01-21T12:03:35.228 | 2021-01-21T12:03:35.228 |
> >>>>>>>>>>>>>> 2021-01-21T12:03:35.228 |   2021-01-21 | 12:03:35.228 |
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> >>>>>>>>>>>>>>> The return type of now(), proctime() and CURRENT_TIMESTAMP
> still
> >>>>>> be
> >>>>>>>>>>>>>> TIMESTAMP;
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>> Kurt
>
>

Reply via email to