Hello Attila Jeges, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/13722

to look at the new patch set (#6).

Change subject: IMPALA-8703: ISO:SQL:2016 datetime patterns - Milestone 1
......................................................................

IMPALA-8703: ISO:SQL:2016 datetime patterns - Milestone 1

This enhancement introduces FORMAT clause for CAST() operator that is
applicable for casts between string types and timestamp types. Instead
of accepting SimpleDateFormat patterns the FORMAT clause supports
datetime patterns following the ISO:SQL:2016 standard.
Note, the CAST() operator without the FORMAT clause still uses Impala's
implementation of SimpleDateFormat handling. Similarly, the existing
conversion functions such as to_timestamp(), from_timestamp() etc.
remain unchanged and use SimpleDateFormat.

Milestone 1 contains all the format tokens covered by the SQL
standard. Further milestones will add more functionality on top of this
list to cover functionality provided by other RDBMS systems.

List of tokens implemented by this change:
- YYYY, YYY, YY, Y: Year tokens
- RRRR, RR: Round year tokens
- MM: Month
- DD: Day
- DDD: Day of year
- HH, HH12: Hour of day (1-12)
- HH24: Hour of day (0-23)
- MI: Minute
- SS: Second
- SSSSS: Second of day
- FF, FF1, ..., FF9: Fractional second
- AM, PM, A.M., P.M.: Meridiem indicators
- TZH: Timezone hour
- TZM: Timezone minute
- Separators: - . / , ' ; : space
- ISO8601 date indicators (T, Z)

Some notes about the matching algorithm:
- The parsing algorithm uses these tokens in a case insensitive
  manner.
- The separators are interchangeable with each other. For example a
  '-' separator in the format will match with a '.' character in the
  input.
- The length of the separator sequences is handled flexibly meaning
  that a single separator character in the format for instance would
  match with a multi-separator sequence in the input.
- In a string type to timestamp conversion the timezone offset tokens
  are parsed, expected to match with the input but they don't adjust
  the result as the input is already expected to be in UTC format.

Usage example:
SELECT CAST('01-02-2019' AS TIMESTAMP FORMAT 'MM-DD-YYYY');
SELECT CAST('2019.10.10 13:30:40.123456 +01:30' AS TIMESTAMP
    FORMAT 'YYYY-MM-DD HH24:MI:SS.FF9 TZH:TZM');
SELECT CAST(timestamp_column as STRING
    FORMAT "YYYY MM HH12 YY") from some_table;

Change-Id: I19d8d097a45ae6f103b6cd1b2d81aad38dfd9e23
---
M be/src/benchmarks/convert-timestamp-benchmark.cc
M be/src/benchmarks/parse-timestamp-benchmark.cc
M be/src/common/init.cc
M be/src/exec/text-converter.inline.h
M be/src/exprs/CMakeLists.txt
A be/src/exprs/cast-expr.cc
A be/src/exprs/cast-expr.h
M be/src/exprs/cast-functions-ir.cc
M be/src/exprs/date-functions-ir.cc
M be/src/exprs/expr-test.cc
M be/src/exprs/scalar-expr-evaluator.h
M be/src/exprs/scalar-expr.cc
M be/src/exprs/scalar-expr.h
M be/src/exprs/timestamp-functions-ir.cc
M be/src/exprs/timestamp-functions.cc
M be/src/exprs/timestamp-functions.h
M be/src/runtime/CMakeLists.txt
M be/src/runtime/date-parse-util.cc
M be/src/runtime/date-parse-util.h
M be/src/runtime/date-test.cc
M be/src/runtime/date-value.cc
M be/src/runtime/date-value.h
A be/src/runtime/datetime-iso-sql-format-parser.cc
A be/src/runtime/datetime-iso-sql-format-parser.h
A be/src/runtime/datetime-iso-sql-format-tokenizer.cc
A be/src/runtime/datetime-iso-sql-format-tokenizer.h
D be/src/runtime/datetime-parse-util.h
A be/src/runtime/datetime-parser-common.cc
A be/src/runtime/datetime-parser-common.h
R be/src/runtime/datetime-simple-date-format-parser.cc
A be/src/runtime/datetime-simple-date-format-parser.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/timestamp-parse-util.cc
M be/src/runtime/timestamp-parse-util.h
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.cc
M be/src/runtime/timestamp-value.h
M be/src/service/impala-server.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M be/src/testutil/random-vector-generators.h
M be/src/util/dict-test.cc
M be/src/util/min-max-filter-test.cc
M be/src/util/string-parser.h
M common/thrift/Exprs.thrift
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/CastExpr.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
A testdata/workloads/functional-query/queries/QueryTest/cast_format_from_table.test
M testdata/workloads/functional-query/queries/QueryTest/date.test
A tests/query_test/test_cast_with_format.py
54 files changed, 3,363 insertions(+), 856 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/22/13722/6
--
To view, visit http://gerrit.cloudera.org:8080/13722
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I19d8d097a45ae6f103b6cd1b2d81aad38dfd9e23
Gerrit-Change-Number: 13722
Gerrit-PatchSet: 6
Gerrit-Owner: Gabor Kaszab <gaborkas...@cloudera.com>
Gerrit-Reviewer: Attila Jeges <atti...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <gaborkas...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
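
For reviewers who want to reason about the matching notes in the commit message (case-insensitive tokens, interchangeable separators, flexible separator length), the behaviour can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the C++ code in this change: the names `SEPARATORS`, `TOKENS`, `tokenize` and `parse` are hypothetical, only a small numeric subset of the tokens is covered, each token is assumed to consume a fixed number of digits, and a run of separators in the format is assumed to match any non-empty run in the input.

```python
# Separator characters listed in the commit message: - . / , ' ; : space
SEPARATORS = "-./,';: "

# Illustrative subset of the tokens, mapped to an ASSUMED fixed digit width.
TOKENS = {"YYYY": 4, "HH24": 2, "MM": 2, "DD": 2, "MI": 2, "SS": 2}


def tokenize(fmt):
    """Split a format string into ('token', name) and ('sep', None) items."""
    items = []
    upper = fmt.upper()  # tokens are matched case-insensitively
    i = 0
    while i < len(upper):
        if upper[i] in SEPARATORS:
            # Collapse a whole run of separators into a single item.
            while i < len(upper) and upper[i] in SEPARATORS:
                i += 1
            items.append(("sep", None))
            continue
        # Try longer token names first.
        for name in sorted(TOKENS, key=len, reverse=True):
            if upper.startswith(name, i):
                items.append(("token", name))
                i += len(name)
                break
        else:
            raise ValueError(f"unknown format token at position {i}")
    return items


def parse(text, fmt):
    """Return {token name: int}; raise ValueError if the input does not match."""
    pos = 0
    result = {}
    for kind, name in tokenize(fmt):
        if kind == "sep":
            # Any non-empty run of separator characters is accepted, so '-'
            # in the format matches '.' or ' - ' in the input.
            start = pos
            while pos < len(text) and text[pos] in SEPARATORS:
                pos += 1
            if pos == start:
                raise ValueError(f"expected a separator at position {start}")
        else:
            width = TOKENS[name]
            digits = text[pos:pos + width]
            if len(digits) != width or not (digits.isascii() and digits.isdigit()):
                raise ValueError(f"expected {width} digits for {name} at position {pos}")
            result[name] = int(digits)
            pos += width
    if pos != len(text):
        raise ValueError(f"unexpected trailing input at position {pos}")
    return result


print(parse("01-02-2019", "MM-DD-YYYY"))      # first usage example from the commit message
print(parse("2019.10.10", "yyyy-mm-dd"))      # lower-case tokens, '.' matching '-'
print(parse("2019 - 10 / 10", "YYYY-MM-DD"))  # multi-character separator runs
```

The sketch deliberately leaves out the round-year, fractional-second, meridiem and timezone tokens, whose semantics depend on details of the patch that are not spelled out in this message.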