I have a minor suggestion:

I think we may not need to introduce TIMESTAMP_NTZ, we already have the
shortcut
type TIMESTAMP for TIMESTAMP WITHOUT TIME ZONE. I think we will still
suggest
 users use TIMESTAMP even if we have TIMESTAMP_NTZ. Then it seems
introducing
TIMESTAMP_NTZ doesn't help much for users, but introduces more learning
costs.

Best,
Jark





On Thu, 28 Jan 2021 at 18:52, Leonard Xu <xbjt...@gmail.com> wrote:

> Thanks all for sharing your opinions.
>
> Looks like  we’ve reached a consensus about the topic.
>
> @Timo:
> > 1) Are we on the same page that LOCALTIMESTAMP returns TIMESTAMP and not
> TIMESTAMP_LTZ? Maybe we should quickly list also LOCALTIME/LOCALDATE and
> LOCALTIMESTAMP for completeness.
> Yes, LOCALTIMESTAMP returns TIMESTAMP, LOCALTIME returns TIME, the
> behavior of them is clear so I just listed them in the excel[1] of this
> FLIP references.
>
> > 2) Shall we add aliases for the timestamp types as part of this FLIP? I
> see Snowflake supports TIMESTAMP_LTZ , TIMESTAMP_NTZ , TIMESTAMP_TZ [1]. I
> think the discussion was quite cumbersome with the full string of
> `TIMESTAMP WITH LOCAL TIME ZONE`. With this FLIP we are making this type
> even more prominent. And important concepts should have a short name
> because they are used frequently. According to the FLIP, we are introducing
> the abbriviation already in function names like `TO_TIMESTAMP_LTZ`.
> `TIMESTAMP_LTZ` could be treated similar to `STRING` for
> `VARCHAR(MAX_INT)`, the serializable string representation would not change.
>
> @Timo @Jark
> Nice idea, I also suffered from the long name during the discussions, the
> abbreviation will not only help us, but also makes it more convenient for
> users. I list the abbreviation name mapping to support:
> TIMESTAMP WITHOUT TIME ZONE         <=> TIMESTAMP_NTZ   (which synonyms
> TIMESTAMP)
> TIMESTAMP WITH LOCAL TIME ZONE    <=> TIMESTAMP_LTZ
> TIMESTAMP WITH TIME ZONE                 <=> TIMESTAMP_TZ     (supports
> them in the future)
> > 3) I'm fine with supporting all conversion classes like
> java.time.LocalDateTime, java.sql.Timestamp that TimestampType supported
> for LocalZonedTimestampType. But we agree that Instant stays the default
> conversion class right? The default extraction defined in [2] will not
> change, correct?
> Yes, Instant stays the default conversion class. The default
>
> > 4) I would remove the comment "Flink supports TIME-related types with
> precision well", because unfortunately this is still not correct. We still
> have issues with TIME(9), it would be great if someone can finally fix that
> though. Maybe the implementation of this FLIP would be a good time to fix
> this issue.
> You’re right, TIME(9) is not supported yet, I'll take account of TIME(9)
> to the scope of this FLIP.
>
>
> I’ve updated this FLIP[2] according your suggestions @Jark @Timo
> I’ll start the vote soon if there’re no objections.
>
> Best,
> Leonard
>
> [1]
> https://docs.google.com/spreadsheets/d/1T178krh9xG-WbVpN7mRVJ8bzFnaSJx3l-eg1EWZe_X4/edit?usp=sharing
> <
> https://docs.google.com/spreadsheets/d/1T178krh9xG-WbVpN7mRVJ8bzFnaSJx3l-eg1EWZe_X4/edit?usp=sharing
> >
> [2]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-162%3A+Consistent+Flink+SQL+time+function+behavior
> <
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-162:+Consistent+Flink+SQL+time+function+behavior>
>
>
> >
> > On 28.01.21 03:18, Jark Wu wrote:
> >> Thanks Leonard for the further investigation.
> >> I think we all agree we should correct the return value of
> >> CURRENT_TIMESTAMP.
> >> Regarding the return type of CURRENT_TIMESTAMP, I also agree
> TIMESTAMP_LTZ
> >> would be more worldwide useful. This may need more effort, but if this
> is
> >> the right direction, we should do it.
> >> Regarding the CURRENT_TIME, if CURRENT_TIMESTAMP returns
> >>  TIMESTAMP_LTZ, then I think CURRENT_TIME shouldn't return TIME_TZ.
> >> Otherwise, CURRENT_TIME will be quite special and strange.
> >> Thus I think it has to return TIME type. Given that we already have
> >> CURRENT_DATE which returns
> >>  DATE WITHOUT TIME ZONE, I think it's fine to return TIME WITHOUT TIME
> ZONE
> >> for CURRENT_TIME.
> >> In a word, the updated FLIP looks good to me. I especially like the
> >> proposed new function TO_TIMESTAMP_LTZ(numeric, [,scale]).
> >> This will be very convenient to define rowtime on a long value which is
> a
> >> very common case and has been complained a lot in mailing list.
> >> Best,
> >> Jark
> >> On Mon, 25 Jan 2021 at 21:12, Kurt Young <ykt...@gmail.com> wrote:
> >>> Thanks Leonard for the detailed response and also the bad case about
> option
> >>> 1, these all
> >>> make sense to me.
> >>>
> >>> Also nice catch about conversion support of LocalZonedTimestampType, I
> >>> think it actually
> >>> makes sense to support java.sql.Timestamp as well as
> >>> java.time.LocalDateTime. It also has
> >>> a slight benefit that we might have a chance to run the udf which took
> them
> >>> as input parameter
> >>> after we change the return type.
> >>>
> >>> Regarding to the return type of CURRENT_TIME, I also think timezone
> >>> information is not useful.
> >>> To not expand this FLIP further, I'm lean to keep it as it is.
> >>>
> >>> Best,
> >>> Kurt
> >>>
> >>>
> >>> On Mon, Jan 25, 2021 at 8:50 PM Leonard Xu <xbjt...@gmail.com> wrote:
> >>>
> >>>> Hi, All
> >>>>
> >>>>  Thanks for your comments. I think all of the thread have agreed that:
> >>>> (1) The return values of
> CURRENT_TIME/CURRENT_TIMESTAMP/NOW()/PROCTIME()
> >>>> are wrong.
> >>>> (2) The LOCALTIME/LOCALTIMESTAMP and CURRENT_TIME/CURRENT_TIMESTAMP
> >>> should
> >>>> be different whether from SQL standard’s perspective or mature
> systems.
> >>>> (3) The semantics of three TIMESTAMP types in Flink SQL follows the
> SQL
> >>>> standard and also keeps the same with other 'good' vendors.
> >>>>     TIMESTAMP                                   =>  A literal in
> >>>> ‘yyyy-MM-dd HH:mm:ss’ format to describe a time, does not contain
> >>> timezone
> >>>> info, can not represent an absolute time point.
> >>>>     TIMESTAMP WITH LOCAL ZONE =>  Records the elapsed time from
> absolute
> >>>> time point origin, can represent an absolute time point, requires
> local
> >>>> time zone when expressed with ‘yyyy-MM-dd HH:mm:ss’ format.
> >>>>     TIMESTAMP WITH TIME ZONE    =>  Consists of time zone info and a
> >>>> literal in ‘yyyy-MM-dd HH:mm:ss’ format to describe time, can
> represent
> >>> an
> >>>> absolute time point.
> >>>>
> >>>>
> >>>> Currently we've two ways to correct
> >>>> CURRENT_TIME/CURRENT_TIMESTAMP/NOW()/PROCTIME().
> >>>>
> >>>> option (1): As the FLIP proposed, change the return value  from UTC
> >>>> timezone to local timezone.
> >>>>         Pros:   (1) The change looks smaller to users and developers
> (2)
> >>>> There're many SQL engines adopted this way
> >>>>         Cons:  (1) connector devs may confuse the underlying value of
> >>>> TimestampData which needs to change according to data type  (2) I
> thought
> >>>> about this weekend. Unfortunately I found a bad case:
> >>>>
> >>>> The proposal is fine if we only use it in FLINK SQL world, but we
> need to
> >>>> consider the conversion between Table/DataStream, assume a record
> >>> produced
> >>>> in UTC+0 timezone with TIMESTAMP '1970-01-01 08:00:44'  and the Flink
> SQL
> >>>> processes the data with session time zone 'UTC+8', if the sql program
> >>> need
> >>>> to convert the Table to DataStream, then we need to calculate the
> >>> timestamp
> >>>> in StreamRecord with session time zone (UTC+8), then we will get 44 in
> >>>> DataStream program, but it is wrong because the expected value should
> be
> >>> (8
> >>>> * 60 * 60 + 44). The corner case tell us that the ROWTIME/PROCTIME in
> >>> Flink
> >>>> are based on UTC+0, when correct the PROCTIME() function, the better
> way
> >>> is
> >>>> to use TIMESTAMP WITH LOCAL TIME ZONE which keeps same long value with
> >>> time
> >>>> based on UTC+0 and can be expressed with  local timezone.
> >>>>
> >>>> option (2) : As we considered in the FLIP as well as @Timo suggested,
> >>>> change the return type to TIMESTAMP WITH LOCAL TIME ZONE, the
> expressed
> >>>> value depends on the local time zone.
> >>>>         Pros: (1) Make Flink SQL more close to SQL standard  (2) Can
> deal
> >>>> the conversion between Table/DataStream well
> >>>>         Cons: (1) We need to discuss the return value/type of
> >>> CURRENT_TIME
> >>>> function (2) The change is bigger to users, we need to support
> TIMESTAMP
> >>>> WITH LOCAL TIME ZONE in connectors/formats as well as custom
> connectors.
> >>>>                    (3)The TIMESTAMP WITH LOCAL TIME ZONE support is
> weak
> >>>> in Flink, thus we need some improvement,but the workload does not
> matter
> >>>> as long as we are doing the right thing ^_^
> >>>>
> >>>> Due to the above bad case for option (1). I think option 2 should be
> >>>> adopted,
> >>>> But we also need to consider some problems:
> >>>> (1) More conversion classes like LocalDateTime, sql.Timestamp should
> be
> >>>> supported for LocalZonedTimestampType to resolve the UDF compatibility
> >>> issue
> >>>> (2) The timezone offset for window size of one day should still be
> >>>> considered
> >>>> (3) All connectors/formats should supports TIMESTAMP WITH LOCAL TIME
> ZONE
> >>>> well and we also should record in document
> >>>> I’ll update these sections of FLIP-162.
> >>>>
> >>>>
> >>>>
> >>>> We also need to discuss the CURRENT_TIME function. I know the standard
> >>> way
> >>>> is using TIME WITH TIME ZONE(there's no TIME WITH LOCAL TIME ZONE),
> but
> >>> we
> >>>> don't support this type yet and I don't see strong motivation to
> support
> >>> it
> >>>> so far.
> >>>> Compared to CURRENT_TIMESTAMP, the CURRENT_TIME can not represent an
> >>>> absolute time point which should be considered as a string consisting
> of
> >>> a
> >>>> time with 'HH:mm:ss' format and time zone info.  We have several
> options
> >>>> for this:
> >>>> (1) We can forbid CURRENT_TIME as @Timo proposed to make all Flink SQL
> >>>> functions follow the standard well,  in this way, we need to offer
> some
> >>>> guidance for user upgrading Flink versions.
> >>>> (2) We can also support it from a user's perspective who has used
> >>>> CURRENT_DATE/CURRENT_TIME/CURRENT_TIMESTAMP, btw,Snowflake also
> returns
> >>>> TIME type.
> >>>> (3) Returns TIMESTAMP WITH LOCAL TIME ZONE to make it equal to
> >>>> CURRENT_TIMESTAMP as Calcite did.
> >>>>
> >>>> I can image (1) which we don't want to left a bad smell in Flink SQL,
> >>> and
> >>>> I also accept (2) because I think users do not consider time zone
> issues
> >>>> when they use CURRENT_DATE/CURRENT_TIME, and the timezone info in
> time is
> >>>> not very useful.
> >>>>
> >>>> I don’t have a strong opinion  for them.  What do others think?
> >>>>
> >>>>
> >>>> I hope I've addressed your concerns. @Timo @Kurt
> >>>>
> >>>> Best,
> >>>> Leonard
> >>>>
> >>>>
> >>>>
> >>>>> Most of the mature systems have a clear difference between
> >>>> CURRENT_TIMESTAMP and LOCALTIMESTAMP. I wouldn't take Spark or Hive
> as a
> >>>> good example. Snowflake decided for TIMESTAMP WITH LOCAL TIME ZONE.
> As I
> >>>> mentioned in the last comment, I could also imagine this behavior for
> >>>> Flink. But in any case, there should be some time zone information
> >>>> considered in order to cast to all other types.
> >>>>>
> >>>>>>>> The function CURRENT_DATE/CURRENT_TIME is supporting in SQL
> >>>> standard, but
> >>>>>>>> LOCALDATE not, I don’t think it’s a good idea that dropping
> >>>> functions which
> >>>>>>>> SQL standard supported and introducing a replacement which SQL
> >>>> standard not
> >>>>>>>> reminded.
> >>>>>
> >>>>> We can still add those functions in the future. But since we don't
> >>> offer
> >>>> a TIME WITH TIME ZONE, it is better to not support this function at
> all
> >>> for
> >>>> now. And by the way, this is exactly the behavior that also Microsoft
> SQL
> >>>> Server does: it also just supports CURRENT_TIMESTAMP (but it returns
> >>>> TIMESTAMP without a zone which completes the confusion).
> >>>>>
> >>>>>>>> I also agree returning  TIMESTAMP WITH LOCAL TIME ZONE for
> PROCTIME
> >>>> has
> >>>>>>>> more clear semantics, but I realized that user didn’t care the
> type
> >>>> but
> >>>>>>>> more about the expressed value they saw, and change the type from
> >>>> TIMESTAMP
> >>>>>>>> to TIMESTAMP WITH LOCAL TIME ZONE brings huge refactor that we
> need
> >>>>>>>> consider all places where the TIMESTAMP type used
> >>>>>
> >>>>> From a UDF perspective, I think nothing will change. The new type
> >>> system
> >>>> and type inference were designed to support all these cases. There is
> a
> >>>> reason why Java has adopted Joda time, because it is hard to come up
> >>> with a
> >>>> good time library. That's why also we and the other Hadoop ecosystem
> >>> folks
> >>>> have decided for 3 different kinds of LocalDateTime, ZonedDateTime,
> and
> >>>> Instance. It makes the library more complex, but time is a complex
> topic.
> >>>>>
> >>>>> I also doubt that many users work with only one time zone. Take the
> US
> >>>> as an example, a country with 3 different timezones. Somebody working
> >>> with
> >>>> US data cannot properly see the data points with just LOCAL TIME ZONE.
> >>> But
> >>>> on the other hand, a lot of event data is stored using a UTC
> timestamp.
> >>>>>
> >>>>>
> >>>>>>> Before jumping into technique details, let's take a step back to
> >>>> discuss
> >>>>>>> user experience.
> >>>>>>>
> >>>>>>> The first important question is what kind of date and time will
> >>> Flink
> >>>>>>> display when users call
> >>>>>>>   CURRENT_TIMESTAMP and maybe also PROCTIME (if we think they are
> >>>> similar).
> >>>>>>>
> >>>>>>> Should it always display the date and time in UTC or in the user's
> >>>> time
> >>>>>>> zone?
> >>>>>
> >>>>> @Kurt: I think we all agree that the current behavior with just
> showing
> >>>> UTC is wrong. Also, we all agree that when calling CURRENT_TIMESTAMP
> or
> >>>> PROCTIME a user would like to see the time in it's current time zone.
> >>>>>
> >>>>> As you said, "my wall clock time".
> >>>>>
> >>>>> However, the question is what is the data type of what you "see". If
> >>> you
> >>>> pass this record on to a different system, operator, or different
> >>> cluster,
> >>>> should the "my" get lost or materialized into the record?
> >>>>>
> >>>>> TIMESTAMP -> completely lost and could cause confusion in a different
> >>>> system
> >>>>>
> >>>>> TIMESTAMP WITH LOCAL TIME ZONE -> at least the UTC is correct, so you
> >>>> can provide a new local time zone
> >>>>>
> >>>>> TIMESTAMP WITH TIME ZONE -> also "your" location is persisted
> >>>>>
> >>>>> Regards,
> >>>>> Timo
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> On 22.01.21 09:38, Kurt Young wrote:
> >>>>>> Forgot one more thing. Continue with displaying in UTC. As a user,
> if
> >>>> Flink
> >>>>>> want to display the timestamp
> >>>>>> in UTC, why don't we offer something like UTC_TIMESTAMP?
> >>>>>> Best,
> >>>>>> Kurt
> >>>>>> On Fri, Jan 22, 2021 at 4:33 PM Kurt Young <ykt...@gmail.com>
> wrote:
> >>>>>>> Before jumping into technique details, let's take a step back to
> >>>> discuss
> >>>>>>> user experience.
> >>>>>>>
> >>>>>>> The first important question is what kind of date and time will
> Flink
> >>>>>>> display when users call
> >>>>>>>  CURRENT_TIMESTAMP and maybe also PROCTIME (if we think they are
> >>>> similar).
> >>>>>>>
> >>>>>>> Should it always display the date and time in UTC or in the user's
> >>> time
> >>>>>>> zone? I think this part is the
> >>>>>>> reason that surprised lots of users. If we forget about the type
> and
> >>>>>>> internal representation of these
> >>>>>>> two methods, as a user, my instinct tells me that these two methods
> >>>> should
> >>>>>>> display my wall clock time.
> >>>>>>>
> >>>>>>> Display time in UTC? I'm not sure, why I should care about UTC
> time?
> >>> I
> >>>>>>> want to get my current timestamp.
> >>>>>>> For those users who have never gone abroad, they might not even be
> >>>> able to
> >>>>>>> realize that this is affected
> >>>>>>> by the time zone.
> >>>>>>>
> >>>>>>> Best,
> >>>>>>> Kurt
> >>>>>>>
> >>>>>>>
> >>>>>>> On Fri, Jan 22, 2021 at 12:25 PM Leonard Xu <xbjt...@gmail.com>
> >>> wrote:
> >>>>>>>
> >>>>>>>> Thanks @Timo for the detailed reply, let's go on this topic on
> this
> >>>>>>>> discussion,  I've merged all mails to this discussion.
> >>>>>>>>
> >>>>>>>>> LOCALDATE / LOCALTIME / LOCALTIMESTAMP
> >>>>>>>>>
> >>>>>>>>> --> uses session time zone, returns DATE/TIME/TIMESTAMP
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>> CURRENT_DATE/CURRENT_TIME/CURRENT_TIMESTAMP
> >>>>>>>>>
> >>>>>>>>> --> uses session time zone, returns DATE/TIME/TIMESTAMP
> >>>>>>>>>
> >>>>>>>>> I'm very sceptical about this behavior. Almost all mature systems
> >>>>>>>> (Oracle, Postgres) and new high quality systems (Presto,
> Snowflake)
> >>>> use a
> >>>>>>>> data type with some degree of time zone information encoded. In a
> >>>>>>>> globalized world with businesses spanning different regions, I
> think
> >>>> we
> >>>>>>>> should do this as well. There should be a difference between
> >>>>>>>> CURRENT_TIMESTAMP and LOCALTIMESTAMP. And users should be able to
> >>>> choose
> >>>>>>>> which behavior they prefer for their pipeline.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> I know that the two series should be different at first glance,
> but
> >>>>>>>> different SQL engines can have their own explanations,for example,
> >>>>>>>> CURRENT_TIMESTAMP and LOCALTIMESTAMP are synonyms in Snowflake[1]
> >>> and
> >>>> has
> >>>>>>>> no difference, and Spark only supports the later one and doesn’t
> >>>> support
> >>>>>>>> LOCALTIME/LOCALTIMESTAMP[2].
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>> If we would design this from scatch, I would suggest the
> following:
> >>>>>>>>>
> >>>>>>>>> - drop CURRENT_DATE / CURRENT_TIME and let users pick LOCALDATE /
> >>>>>>>> LOCALTIME for materialized timestamp parts
> >>>>>>>>
> >>>>>>>> The function CURRENT_DATE/CURRENT_TIME is supporting in SQL
> >>> standard,
> >>>> but
> >>>>>>>> LOCALDATE not, I don’t think it’s a good idea that dropping
> >>> functions
> >>>> which
> >>>>>>>> SQL standard supported and introducing a replacement which SQL
> >>>> standard not
> >>>>>>>> reminded.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>> - CURRENT_TIMESTAMP should return a TIMESTAMP WITH TIME ZONE to
> >>>>>>>> materialize all session time information into every record. It it
> >>> the
> >>>> most
> >>>>>>>> generic data type and allows to cast to all other timestamp data
> >>>> types.
> >>>>>>>> This generic ability can be used for filter predicates as well
> >>> either
> >>>>>>>> through implicit or explicit casting.
> >>>>>>>>
> >>>>>>>> TIMESTAMP WITH TIME ZONE indeed contains more information to
> >>> describe
> >>>> a
> >>>>>>>> time point, but the type TIMESTAMP  can cast to all other
> timestamp
> >>>> data
> >>>>>>>> types combining with session time zone as well, and it also can be
> >>>> used for
> >>>>>>>> filter predicates. For type casting between BIGINT and TIMESTAMP,
> I
> >>>> think
> >>>>>>>> the function way using TO_TIMEMTAMP()/FROM_UNIXTIMESTAMP() is more
> >>>> clear.
> >>>>>>>>
> >>>>>>>>> PROCTIME/ROWTIME should be time functions based on a long value.
> >>> Both
> >>>>>>>> System.currentMillis() and our watermark system work on long
> values.
> >>>> Those
> >>>>>>>> should return TIMESTAMP WITH LOCAL TIME ZONE because the main
> >>>> calculation
> >>>>>>>> should always happen based on UTC.
> >>>>>>>>> We discussed it in a different thread, but we should allow
> PROCTIME
> >>>>>>>> globally. People need a way to create instances of TIMESTAMP WITH
> >>>> LOCAL
> >>>>>>>> TIME ZONE. This is not considered in the current design doc.
> >>>>>>>>> Many pipelines contain UTC timestamps and thus it should be easy
> to
> >>>>>>>> create one.
> >>>>>>>>> Also, both CURRENT_TIMESTAMP and LOCALTIMESTAMP can work with
> this
> >>>> type
> >>>>>>>> because we should remember that TIMESTAMP WITH LOCAL TIME ZONE
> >>>> accepts all
> >>>>>>>> timestamp data types as casting target [1]. We could allow
> TIMESTAMP
> >>>> WITH
> >>>>>>>> TIME ZONE in the future for ROWTIME.
> >>>>>>>>> In any case, windows should simply adapt their behavior to the
> >>> passed
> >>>>>>>> timestamp type. And with TIMESTAMP WITH LOCAL TIME ZONE a day is
> >>>> defined by
> >>>>>>>> considering the current session time zone.
> >>>>>>>>
> >>>>>>>> I also agree returning  TIMESTAMP WITH LOCAL TIME ZONE for
> PROCTIME
> >>>> has
> >>>>>>>> more clear semantics, but I realized that user didn’t care the
> type
> >>>> but
> >>>>>>>> more about the expressed value they saw, and change the type from
> >>>> TIMESTAMP
> >>>>>>>> to TIMESTAMP WITH LOCAL TIME ZONE brings huge refactor that we
> need
> >>>>>>>> consider all places where the TIMESTAMP type used, and many
> builtin
> >>>>>>>> functions and UDFs doest not support  TIMESTAMP WITH LOCAL TIME
> ZONE
> >>>> type.
> >>>>>>>> That means both user and Flink devs need to refactor the code(UDF,
> >>>> builtin
> >>>>>>>> functions, sql pipeline), to be honest, I didn’t see strong
> >>>> motivation that
> >>>>>>>> we have to do the pretty big refactor from user’s perspective and
> >>>>>>>> developer’s perspective.
> >>>>>>>>
> >>>>>>>> In one word, both your suggestion and my proposal can resolve
> almost
> >>>> all
> >>>>>>>> user problems,the divergence is whether we need to spend pretty
> >>>> energy just
> >>>>>>>> to get a bit more accurate semantics?   I think we need a
> tradeoff.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Best,
> >>>>>>>> Leonard
> >>>>>>>> [1]
> >>>>>>>>
> >>>>
> https://trino.io/docs/current/functions/datetime.html#current_timestamp
> >>> <
> >>>>>>>>
> >>>>
> https://trino.io/docs/current/functions/datetime.html#current_timestamp>
> >>>>>>>> [2] https://issues.apache.org/jira/browse/SPARK-30374 <
> >>>>>>>> https://issues.apache.org/jira/browse/SPARK-30374>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>>  2021-01-22,00:53,Timo Walther <twal...@apache.org> :
> >>>>>>>>>
> >>>>>>>>> Hi Leonard,
> >>>>>>>>>
> >>>>>>>>> thanks for working on this topic. I agree that time handling is
> not
> >>>>>>>> easy in Flink at the moment. We added new time data types (and
> some
> >>>> are
> >>>>>>>> still not supported which even further complicates things like
> >>>> TIME(9)). We
> >>>>>>>> should definitely improve this situation for users.
> >>>>>>>>>
> >>>>>>>>> This is a pretty opinionated topic and it seems that the SQL
> >>> standard
> >>>>>>>> is not really deciding this but is at least supporting. So let me
> >>>> express
> >>>>>>>> my opinion for the most important functions:
> >>>>>>>>>
> >>>>>>>>> LOCALDATE / LOCALTIME / LOCALTIMESTAMP
> >>>>>>>>>
> >>>>>>>>> --> uses session time zone, returns DATE/TIME/TIMESTAMP
> >>>>>>>>>
> >>>>>>>>> I think those are the most obvious ones because the LOCAL
> indicates
> >>>>>>>> that the locality should be materialized into the result and any
> >>> time
> >>>> zone
> >>>>>>>> information (coming from session config or data) is not important
> >>>>>>>> afterwards.
> >>>>>>>>>
> >>>>>>>>> CURRENT_DATE/CURRENT_TIME/CURRENT_TIMESTAMP
> >>>>>>>>>
> >>>>>>>>> --> uses session time zone, returns DATE/TIME/TIMESTAMP
> >>>>>>>>>
> >>>>>>>>> I'm very sceptical about this behavior. Almost all mature systems
> >>>>>>>> (Oracle, Postgres) and new high quality systems (Presto,
> Snowflake)
> >>>> use a
> >>>>>>>> data type with some degree of time zone information encoded. In a
> >>>>>>>> globalized world with businesses spanning different regions, I
> think
> >>>> we
> >>>>>>>> should do this as well. There should be a difference between
> >>>>>>>> CURRENT_TIMESTAMP and LOCALTIMESTAMP. And users should be able to
> >>>> choose
> >>>>>>>> which behavior they prefer for their pipeline.
> >>>>>>>>>
> >>>>>>>>> If we would design this from scatch, I would suggest the
> following:
> >>>>>>>>>
> >>>>>>>>> - drop CURRENT_DATE / CURRENT_TIME and let users pick LOCALDATE /
> >>>>>>>> LOCALTIME for materialized timestamp parts
> >>>>>>>>>
> >>>>>>>>> - CURRENT_TIMESTAMP should return a TIMESTAMP WITH TIME ZONE to
> >>>>>>>> materialize all session time information into every record. It it
> >>> the
> >>>> most
> >>>>>>>> generic data type and allows to cast to all other timestamp data
> >>>> types.
> >>>>>>>> This generic ability can be used for filter predicates as well
> >>> either
> >>>>>>>> through implicit or explicit casting.
> >>>>>>>>>
> >>>>>>>>> PROCTIME/ROWTIME should be time functions based on a long value.
> >>> Both
> >>>>>>>> System.currentMillis() and our watermark system work on long
> values.
> >>>> Those
> >>>>>>>> should return TIMESTAMP WITH LOCAL TIME ZONE because the main
> >>>> calculation
> >>>>>>>> should always happen based on UTC. We discussed it in a different
> >>>> thread,
> >>>>>>>> but we should allow PROCTIME globally. People need a way to create
> >>>>>>>> instances of TIMESTAMP WITH LOCAL TIME ZONE. This is not
> considered
> >>>> in the
> >>>>>>>> current design doc. Many pipelines contain UTC timestamps and thus
> >>> it
> >>>>>>>> should be easy to create one. Also, both CURRENT_TIMESTAMP and
> >>>>>>>> LOCALTIMESTAMP can work with this type because we should remember
> >>> that
> >>>>>>>> TIMESTAMP WITH LOCAL TIME ZONE accepts all timestamp data types as
> >>>> casting
> >>>>>>>> target [1]. We could allow TIMESTAMP WITH TIME ZONE in the future
> >>> for
> >>>>>>>> ROWTIME.
> >>>>>>>>>
> >>>>>>>>> In any case, windows should simply adapt their behavior to the
> >>> passed
> >>>>>>>> timestamp type. And with TIMESTAMP WITH LOCAL TIME ZONE a day is
> >>>> defined by
> >>>>>>>> considering the current session time zone.
> >>>>>>>>>
> >>>>>>>>> If we would like to design this with less effort required, we
> could
> >>>>>>>> think about returning TIMESTAMP WITH LOCAL TIME ZONE also for
> >>>>>>>> CURRENT_TIMESTAMP.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> I will try to involve more people into this discussion.
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>> Timo
> >>>>>>>>>
> >>>>>>>>> [1]
> >>>>>>>>
> >>>>
> >>>
> https://docs.oracle.com/en/database/oracle/oracle-database/21/sqlrf/Data-Types.html#GUID-E7CA339A-2093-4FE4-A36E-1D09593591D3
> >>>>>>>> <
> >>>>>>>>
> >>>>
> >>>
> https://docs.oracle.com/en/database/oracle/oracle-database/21/sqlrf/Data-Types.html#GUID-E7CA339A-2093-4FE4-A36E-1D09593591D3
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>>  2021-01-21,22:32,Leonard Xu <xbjt...@gmail.com> :
> >>>>>>>>>> Before the changes, as I am writing this reply, the local time
> >>> here
> >>>> is
> >>>>>>>> 2021-01-21 12:03:35 (Beijing time, UTC+8).
> >>>>>>>>>> And I tried these 5 functions in sql client, and got:
> >>>>>>>>>>
> >>>>>>>>>> Flink SQL> select now(), PROCTIME(), CURRENT_TIMESTAMP,
> >>>> CURRENT_DATE,
> >>>>>>>> CURRENT_TIME;
> >>>>>>>>>>
> >>>>>>>>
> >>>>
> >>>
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> >>>>>>>>>> |                  EXPR$0 |                  EXPR$1 |
> >>>>>>>>  CURRENT_TIMESTAMP | CURRENT_DATE | CURRENT_TIME |
> >>>>>>>>>>
> >>>>>>>>
> >>>>
> >>>
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> >>>>>>>>>> | 2021-01-21T04:03:35.228 | 2021-01-21T04:03:35.228 |
> >>>>>>>> 2021-01-21T04:03:35.228 |   2021-01-21 | 04:03:35.228 |
> >>>>>>>>>>
> >>>>>>>>
> >>>>
> >>>
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> >>>>>>>>>> After the changes, the expected behavior will change to:
> >>>>>>>>>>
> >>>>>>>>>> Flink SQL> select now(), PROCTIME(), CURRENT_TIMESTAMP,
> >>>> CURRENT_DATE,
> >>>>>>>> CURRENT_TIME;
> >>>>>>>>>>
> >>>>>>>>
> >>>>
> >>>
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> >>>>>>>>>> |                  EXPR$0 |                  EXPR$1 |
> >>>>>>>>  CURRENT_TIMESTAMP | CURRENT_DATE | CURRENT_TIME |
> >>>>>>>>>>
> >>>>>>>>
> >>>>
> >>>
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> >>>>>>>>>> | 2021-01-21T12:03:35.228 | 2021-01-21T12:03:35.228 |
> >>>>>>>> 2021-01-21T12:03:35.228 |   2021-01-21 | 12:03:35.228 |
> >>>>>>>>>>
> >>>>>>>>
> >>>>
> >>>
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> >>>>>>>>>> The return type of now(), proctime() and CURRENT_TIMESTAMP still
> >>> be
> >>>>>>>> TIMESTAMP;
> >>>>>>>>>
> >>>>>>>>> To Kurt, thanks  for the intuitive case, it really clear, you’re
> >>>> wright
> >>>>>>>> that I want to propose to change the return value of these
> >>> functions.
> >>>> It’s
> >>>>>>>> the most important part of the topic from user's perspective.
> >>>>>>>>>
> >>>>>>>>>> I think this definitely deserves a FLIP.
> >>>>>>>>> To Jark,  nice suggestion, I prepared a FLIP for this topic, and
> >>> will
> >>>>>>>> start the FLIP discussion soon.
> >>>>>>>>>
> >>>>>>>>>>> If use the default Flink SQL,&nbsp; the window time range of
> the
> >>>>>>>>>>> statistics is incorrect, then the statistical results will
> >>>> naturally
> >>>>>>>> be
> >>>>>>>>>>> incorrect.
> >>>>>>>>> To zhisheng, sorry to hear that this problem influenced your
> >>>> production
> >>>>>>>> jobs,  Could you share your SQL pattern?  we can have more inputs
> >>> and
> >>>> try
> >>>>>>>> to resolve them.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Best,
> >>>>>>>>> Leonard
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>>  2021-01-21,14:19,Jark Wu <imj...@gmail.com> :
> >>>>>>>>>
> >>>>>>>>> Great examples to understand the problem and the proposed
> changes,
> >>>>>>>> @Kurt!
> >>>>>>>>>
> >>>>>>>>> Thanks Leonard for investigating this problem.
> >>>>>>>>> The time-zone problems around time functions and windows have
> >>>> bothered a
> >>>>>>>>> lot of users. It's time to fix them!
> >>>>>>>>>
> >>>>>>>>> The return value changes sound reasonable to me, and keeping the
> >>>> return
> >>>>>>>>> type unchanged will minimize the surprise to the users.
> >>>>>>>>> Besides that, I think it would be better to mention how this
> >>> affects
> >>>> the
> >>>>>>>>> window behaviors, and the interoperability with DataStream.
> >>>>>>>>>
> >>>>>>>>> I think this definitely deserves a FLIP.
> >>>>>>>>>
> >>>>>>>>> ====================================================
> >>>>>>>>>
> >>>>>>>>> Hi zhisheng,
> >>>>>>>>>
> >>>>>>>>> Do you have examples to illustrate which case will get the wrong
> >>>> window
> >>>>>>>>> boundaries?
> >>>>>>>>> That will help to verify whether the proposed changes can solve
> >>> your
> >>>>>>>>> problem.
> >>>>>>>>>
> >>>>>>>>> Best,
> >>>>>>>>> Jark
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>> 2021-01-21,12:54,zhisheng <173855...@qq.com> :
> >>>>>>>>>
> >>>>>>>>> Thanks to Leonard Xu for discussing this tricky topic. At
> present,
> >>>>>>>> there are many Flink jobs in our production environment that are
> >>> used
> >>>> to
> >>>>>>>> count day-level reports (eg: count PV/UV ).&nbsp;
> >>>>>>>>>
> >>>>>>>>> If use the default Flink SQL,&nbsp; the window time range of the
> >>>>>>>> statistics is incorrect, then the statistical results will
> naturally
> >>>> be
> >>>>>>>> incorrect.&nbsp;
> >>>>>>>>>
> >>>>>>>>> The user needs to deal with the time zone manually in order to
> >>> solve
> >>>>>>>> the problem.&nbsp;
> >>>>>>>>>
> >>>>>>>>> If Flink itself can solve these time zone issues, then I think it
> >>>> will
> >>>>>>>> be user-friendly.
> >>>>>>>>>
> >>>>>>>>> Thank you
> >>>>>>>>>
> >>>>>>>>> Best!;
> >>>>>>>>> zhisheng
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>>  2021-01-21,12:11,Kurt Young <ykt...@gmail.com> :
> >>>>>>>>>
> >>>>>>>>> cc this to user & user-zh mailing list because this will affect
> >>> lots
> >>>> of
> >>>>>>>> users, and also quite a lot of users
> >>>>>>>>> were asking questions around this topic.
> >>>>>>>>>
> >>>>>>>>> Let me try to understand this from user's perspective.
> >>>>>>>>>
> >>>>>>>>> Your proposal will affect five functions, which are:
> >>>>>>>>> PROCTIME()
> >>>>>>>>> NOW()
> >>>>>>>>> CURRENT_DATE
> >>>>>>>>> CURRENT_TIME
> >>>>>>>>> CURRENT_TIMESTAMP
> >>>>>>>>> Before the changes, as I am writing this reply, the local time
> here
> >>>> is
> >>>>>>>> 2021-01-21 12:03:35 (Beijing time, UTC+8).
> >>>>>>>>> And I tried these 5 functions in sql client, and got:
> >>>>>>>>>
> >>>>>>>>> Flink SQL> select now(), PROCTIME(), CURRENT_TIMESTAMP,
> >>> CURRENT_DATE,
> >>>>>>>> CURRENT_TIME;
> >>>>>>>>>
> >>>>>>>>
> >>>>
> >>>
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> >>>>>>>>> |                  EXPR$0 |                  EXPR$1 |
> >>>>>>>>  CURRENT_TIMESTAMP | CURRENT_DATE | CURRENT_TIME |
> >>>>>>>>>
> >>>>>>>>
> >>>>
> >>>
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> >>>>>>>>> | 2021-01-21T04:03:35.228 | 2021-01-21T04:03:35.228 |
> >>>>>>>> 2021-01-21T04:03:35.228 |   2021-01-21 | 04:03:35.228 |
> >>>>>>>>>
> >>>>>>>>
> >>>>
> >>>
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> >>>>>>>>> After the changes, the expected behavior will change to:
> >>>>>>>>>
> >>>>>>>>> Flink SQL> select now(), PROCTIME(), CURRENT_TIMESTAMP,
> >>> CURRENT_DATE,
> >>>>>>>> CURRENT_TIME;
> >>>>>>>>>
> >>>>>>>>
> >>>>
> >>>
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> >>>>>>>>> |                  EXPR$0 |                  EXPR$1 |
> >>>>>>>>  CURRENT_TIMESTAMP | CURRENT_DATE | CURRENT_TIME |
> >>>>>>>>>
> >>>>>>>>
> >>>>
> >>>
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> >>>>>>>>> | 2021-01-21T12:03:35.228 | 2021-01-21T12:03:35.228 |
> >>>>>>>> 2021-01-21T12:03:35.228 |   2021-01-21 | 12:03:35.228 |
> >>>>>>>>>
> >>>>>>>>
> >>>>
> >>>
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> >>>>>>>>> The return type of now(), proctime() and CURRENT_TIMESTAMP still
> be
> >>>>>>>> TIMESTAMP;
> >>>>>>>>>
> >>>>>>>>> Best,
> >>>>>>>>> Kurt
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>
> >>>>
> >>>>
> >>>
> >
>
>

Reply via email to