Csaba Ringhofer has uploaded this change for review. ( http://gerrit.cloudera.org:8080/14832 )

Change subject: IMPALA-8184: Add timestamp validation to Orc scanner
......................................................................

IMPALA-8184: Add timestamp validation to Orc scanner

Hive can write timestamps that are outside Impala's valid range
(Impala: 1400-9999, Hive: 0001-9999). This change adds validation
logic to Orc reading that replaces out-of-range timestamps with
NULLs and adds a warning to the query.

The logic is very similar to the existing validation in Parquet.
Some differences:
- "time of day" is not checked separately, as that doesn't make
  sense with Orc's encoding
- only the column id, not the column name, is added to the warning

Testing:
- added a simple EE test that scans an existing Orc file

Change-Id: I8ee2ba83a54f93d37e8832e064f2c8418b503490
---
M be/src/exec/orc-column-readers.cc
M common/thrift/generate_error_codes.py
M testdata/data/README
A testdata/data/out_of_range_timestamp.orc
A testdata/workloads/functional-query/queries/DataErrorsTest/orc-out-of-range-timestamp.test
M tests/query_test/test_scanners.py
6 files changed, 42 insertions(+), 0 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/32/14832/1
--
To view, visit http://gerrit.cloudera.org:8080/14832
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I8ee2ba83a54f93d37e8832e064f2c8418b503490
Gerrit-Change-Number: 14832
Gerrit-PatchSet: 1
Gerrit-Owner: Csaba Ringhofer <csringho...@cloudera.com>
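
For readers unfamiliar with the behaviour described in the commit message, here is a minimal standalone sketch of the idea: values outside the 1400-9999 range become NULL, and a warning naming the column id is recorded. It is not the code from this patch. The names `ValidateTimestamps`, `TimestampSlot` and `ScanResult`, the warning text, and the choice to emit the warning once per call are all hypothetical. The sketch also assumes each timestamp is given as whole seconds since the Unix epoch in UTC, with the range bounds computed on the proleptic Gregorian calendar.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Assumed bounds of the valid range, in seconds since 1970-01-01 00:00:00 UTC
// (proleptic Gregorian calendar). Illustrative constants, not taken from the patch.
constexpr int64_t kMinValidSeconds = -17987443200LL;  // 1400-01-01 00:00:00
constexpr int64_t kMaxValidSeconds = 253402300799LL;  // 9999-12-31 23:59:59

// Hypothetical output slot: either a valid timestamp or NULL.
struct TimestampSlot {
  bool is_null;
  int64_t seconds;  // meaningful only when is_null is false
};

// Hypothetical result of scanning one column: the slots plus any warnings.
struct ScanResult {
  std::vector<TimestampSlot> values;
  std::vector<std::string> warnings;
};

// Replaces out-of-range timestamps with NULL and records a single warning
// that identifies the column by id (no column name, as in the commit message).
ScanResult ValidateTimestamps(const std::vector<int64_t>& seconds, int column_id) {
  ScanResult result;
  bool warned = false;
  for (int64_t s : seconds) {
    if (s < kMinValidSeconds || s > kMaxValidSeconds) {
      result.values.push_back(TimestampSlot{true, 0});
      if (!warned) {
        result.warnings.push_back("Out-of-range timestamp in ORC column id " +
                                  std::to_string(column_id) +
                                  "; value replaced with NULL");
        warned = true;
      }
    } else {
      result.values.push_back(TimestampSlot{false, s});
    }
  }
  return result;
}
```

Calling `ValidateTimestamps({0, -17987443201LL}, 3)` under these assumptions keeps the first value, turns the second into NULL, and produces one warning that mentions column id 3.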