[ https://issues.apache.org/jira/browse/IMPALA-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16757273#comment-16757273 ]
Gabor Kaszab commented on IMPALA-4018: -------------------------------------- Hey, I have reflected to some of the points above on the code review but will cover them here as well (https://gerrit.cloudera.org/#/c/12267/). About the technical details of the current solution: - We currently have to_timestamp() and from_timestamp() to have both directions of the "string/varchar/char vs timestamp" conversions. - to_char() mentioned above sounds similar to from_timestamp() for me. - In theory these functions use the Java pattern for conversion but this might be misleading as the actual parsing of the pattern and conversion of values happen in BE C++. So there is an Impala specific implementation that is meant to reflect the Java pattern and we don't re-use a built-in Java library for this purpose. - Have a parsing algorithm in the FE is feasible but since the actual formatting has to happen in the BE we should have a quite similar implementation there as well and maintain both of them. In addition to this we have to send the parsed tokens at least between FE and BE in case we want to have the parsing in the FE. I still feel that delivering first CAST(..FORMAT..) with the Java pattern makes much sense. It was mentioned as a requirement that the new SQL pattern should be hidden by a feature flag as it changes how from_timestamp() and to_timestamp() work. Then it would be reasonable to have both the Java and SQL pattern available for the CAST(..FORMAT..) as well. I imagine that a situation where a feature flag changes to_timestamp() and from_timestamp() behaviour but doesn't change CAST(..FORMAT..) would lead to more misunderstandings of how Impala handles these patterns. In my opinion it would be a cleaner approach to have CAST(..FORMAT..) now to be in line with the other 2 conversion functions, and then introduce the new SQL pattern in one go for every function that uses these datetime patterns. (Introducing CAST(..FORMAT..) with Java pattern is already on review) > Add support for SQL:2016 datetime templates/patterns/masks to CAST(... AS ... > FORMAT <template>) > ------------------------------------------------------------------------------------------------ > > Key: IMPALA-4018 > URL: https://issues.apache.org/jira/browse/IMPALA-4018 > Project: IMPALA > Issue Type: New Feature > Components: Frontend > Affects Versions: Impala 2.2.4 > Reporter: Greg Rahn > Assignee: Gabor Kaszab > Priority: Critical > Labels: ansi-sql, compatibility, sql-language > > *Summary* > The format masks/templates for currently are implemented using the [Java > SimpleDateFormat > patterns|http://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html], > and although this is what Hive has implemented, it is not what most standard > SQL systems implement. For example see > [Vertica|https://my.vertica.com/docs/7.2.x/HTML/Content/Authoring/SQLReferenceManual/Functions/Formatting/TemplatePatternsForDateTimeFormatting.htm], > > [Netezza|http://www.ibm.com/support/knowledgecenter/SSULQD_7.2.1/com.ibm.nz.dbu.doc/r_dbuser_ntz_sql_extns_templ_patterns_date_time_conv.html], > > [Oracle|https://docs.oracle.com/database/121/SQLRF/sql_elements004.htm#SQLRF00212], > and > [PostgreSQL|https://www.postgresql.org/docs/9.5/static/functions-formatting.html#FUNCTIONS-FORMATTING-DATETIME-TABLE]. > > *Examples of incompatibilities* > {noformat} > -- PostgreSQL/Netezza/Vertica/Oracle > select to_timestamp('May 15, 2015 12:00:00', 'mon dd, yyyy hh:mi:ss'); > -- Impala > select to_timestamp('May 15, 2015 12:00:00', 'MMM dd, yyyy HH:mm:ss'); > -- PostgreSQL/Netezza/Vertica/Oracle > select to_timestamp('2015-02-14 20:19:07','yyyy-mm-dd hh24:mi:ss'); > -- Impala > select to_timestamp('2015-02-14 20:19:07','yyyy-MM-dd HH:mm:ss'); > -- Vertica/Oracle > select to_timestamp('2015-02-14 20:19:07.123456','yyyy-mm-dd hh24:mi:ss.ff'); > -- Impala > select to_timestamp('2015-02-14 20:19:07.123456','yyyy-MM-dd > HH:mm:ss.SSSSSS'); > {noformat} > *Considerations* > Because this is a change in default behavior for to_timestamp(), if possible, > having a feature flag to revert to the legacy Java SimpleDateFormat patterns > should be strongly considered. This would allow users to chose the behavior > they desire and scope it to a session if need be. > SQL:2016 defines the following datetime templates > {noformat} > <datetime template> ::= > { <datetime template part> }... > <datetime template part> ::= > <datetime template field> > | <datetime template delimiter> > <datetime template field> ::= > <datetime template year> > | <datetime template rounded year> > | <datetime template month> > | <datetime template day of month> > | <datetime template day of year> > | <datetime template 12-hour> > | <datetime template 24-hour> > | <datetime template minute> > | <datetime template second of minute> > | <datetime template second of day> > | <datetime template fraction> > | <datetime template am/pm> > | <datetime template time zone hour> > | <datetime template time zone minute> > <datetime template delimiter> ::= > <minus sign> > | <period> > | <solidus> > | <comma> > | <apostrophe> > | <semicolon> > | <colon> > | <space> > <datetime template year> ::= > YYYY | YYY | YY | Y > <datetime template rounded year> ::= > RRRR | RR > <datetime template month> ::= > MM > <datetime template day of month> ::= > DD > <datetime template day of year> ::= > DDD > <datetime template 12-hour> ::= > HH | HH12 > <datetime template 24-hour> ::= > HH24 > <datetime template minute> ::= > MI > <datetime template second of minute> ::= > SS > <datetime template second of day> ::= > SSSSS > <datetime template fraction> ::= > FF1 | FF2 | FF3 | FF4 | FF5 | FF6 | FF7 | FF8 | FF9 > <datetime template am/pm> ::= > A.M. | P.M. > <datetime template time zone hour> ::= > TZH > <datetime template time zone minute> ::= > TZM > {noformat} > SQL:2016 also introduced the FORMAT clause for CAST which is the standard way > to do string <> datetime conversions > {noformat} > <cast specification> ::= > CAST <left paren> > <cast operand> AS <cast target> > [ FORMAT <cast template> ] > <right paren> > <cast operand> ::= > <value expression> > | <implicitly typed value specification> > <cast target> ::= > <domain name> > | <data type> > <cast template> ::= > <character string literal> > {noformat} > For example: > {noformat} > CAST(<datetime> AS <char string type> [FORMAT <template>]) > CAST(<char string> AS <datetime type> [FORMAT <template>]) > cast(dt as string format 'DD-MM-YYYY') > cast('01-05-2017' as date format 'DD-MM-YYYY') > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org