Hello Gabor Kaszab, Zoltan Borok-Nagy, Csaba Ringhofer, I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/9986 to look at the new patch set (#6). Change subject: IMPALA-3307: Add support for IANA time-zone db ...................................................................... IMPALA-3307: Add support for IANA time-zone db Impala currently uses two different libraries for timestamp manipulations: boost and glibc. Issues with boost: - Time-zone database is currently hard coded in timezone_db.cc. Impala admins cannot update it without upgrading Impala. - Time-zone database is flat, therefore can’t track year-to-year changes. - Time-zone database is not updated on a regular basis. Issues with glibc: - Uses /usr/share/zoneinfo/ database which could be out of sync on some of the nodes in the Impala cluster. - Uses the host system’s local time-zone. Different nodes in the Impala cluster might use a different local time-zone. - Conversion functions take a global lock, which causes severe performance degradation. In addition to the issues above, the fact that /usr/share/zoneinfo/ and the hard-coded boost time-zone database are both in use is a source of inconsistency in itself. This patch makes the following changes: - Instead of boost and glibc, impalad uses Google's CCTZ to implement time-zone conversions. - Introduces a new startup flag (--hdfs_zone_info_dir) to impalad to specify an HDFS/S3/ADLS location that contains the shared compiled IANA time-zone database. If the startup flag is set, impalad will use the specified time-zone database. Otherwise, impalad will use the default /usr/share/zoneinfo time-zone database. - impalad reads the entire time-zone database into an in-memory map on startup for fast lookups. - The name of the coordinator node’s local time-zone is saved to the query context when preparing query execution. This time-zone is used whenever the current time-zone is referred afterwards in an execution node. - Introduces a new startup flag (--hdfs_zone_abbrev_conf) to impalad to specify an HDFS/S3/ADLS path to a shared config file that contains definitions for non-standard time-zone abbreviations. Change-Id: I93c1fbffe81f067919706e30db0a34d0e58e7e77 --- M CMakeLists.txt M be/CMakeLists.txt M be/src/benchmarks/CMakeLists.txt A be/src/benchmarks/convert-timestamp-benchmark.cc M be/src/common/global-types.h M be/src/exec/data-source-scan-node.cc M be/src/exec/data-source-scan-node.h M be/src/exec/hdfs-orc-scanner.cc M be/src/exec/parquet-column-readers.cc M be/src/exprs/CMakeLists.txt M be/src/exprs/aggregate-functions-ir.cc M be/src/exprs/cast-functions-ir.cc M be/src/exprs/decimal-operators-ir.cc M be/src/exprs/decimal-operators.h M be/src/exprs/expr-test.cc M be/src/exprs/literal.cc M be/src/exprs/timestamp-functions-ir.cc M be/src/exprs/timestamp-functions.cc A be/src/exprs/timezone_db-test.cc M be/src/exprs/timezone_db.cc M be/src/exprs/timezone_db.h M be/src/runtime/raw-value-test.cc M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/runtime/timestamp-test.cc M be/src/runtime/timestamp-value.cc M be/src/runtime/timestamp-value.h M be/src/runtime/timestamp-value.inline.h M be/src/service/impala-server.cc M be/src/service/impalad-main.cc M be/src/util/filesystem-util-test.cc M be/src/util/filesystem-util.cc M be/src/util/filesystem-util.h M be/src/util/hdfs-util-test.cc M be/src/util/hdfs-util.cc M be/src/util/hdfs-util.h M be/src/util/time-test.cc M be/src/util/time.cc M be/src/util/time.h M bin/bootstrap_toolchain.py M bin/impala-config.sh A cmake_modules/FindCctz.cmake M common/thrift/ImpalaInternalService.thrift M common/thrift/metrics.json M fe/src/test/java/org/apache/impala/testutil/TestUtils.java M testdata/bin/create-load-data.sh M testdata/data/timezoneverification.csv A testdata/tzdb/abbrev.conf A testdata/tzdb/zoneinfo/AmerICA/ArgeNTINA/MendOZA A testdata/tzdb/zoneinfo/AmerICA/CancUN A testdata/tzdb/zoneinfo/UTC M testdata/workloads/functional-query/queries/QueryTest/exprs.test A tests/custom_cluster/test_custom_tzdb.py 53 files changed, 2,569 insertions(+), 1,092 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/9986/6 -- To view, visit http://gerrit.cloudera.org:8080/9986 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I93c1fbffe81f067919706e30db0a34d0e58e7e77 Gerrit-Change-Number: 9986 Gerrit-PatchSet: 6 Gerrit-Owner: Attila Jeges <atti...@cloudera.com> Gerrit-Reviewer: Attila Jeges <atti...@cloudera.com> Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com> Gerrit-Reviewer: Gabor Kaszab <gaborkas...@cloudera.com> Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>