Rahul Challapalli created DRILL-4345:
----------------------------------------

             Summary: Hive Native Reader reporting wrong results for timestamp column in hive generated parquet file
                 Key: DRILL-4345
                 URL: https://issues.apache.org/jira/browse/DRILL-4345
             Project: Apache Drill
          Issue Type: Bug
          Components: Storage - Hive, Storage - Parquet
            Reporter: Rahul Challapalli
            Priority: Critical


git.commit.id.abbrev=1b96174

Below you can see different results returned from hive plugin and native reader for the same table.

{code}
0: jdbc:drill:zk=10.10.100.190:5181> use hive;
+-------+-----------------------------------+
| ok    | summary                           |
+-------+-----------------------------------+
| true  | Default schema changed to [hive]  |
+-------+-----------------------------------+
1 row selected (0.415 seconds)
0: jdbc:drill:zk=10.10.100.190:5181> select int_col, timestamp_col from hive1_fewtypes_null_parquet;
+----------+------------------------+
| int_col  | timestamp_col          |
+----------+------------------------+
| 1        | null                   |
| null     | 1997-01-02 00:00:00.0  |
| 3        | null                   |
| 4        | null                   |
| 5        | 1997-02-10 17:32:00.0  |
| 6        | 1997-02-11 17:32:01.0  |
| 7        | 1997-02-12 17:32:01.0  |
| 8        | 1997-02-13 17:32:01.0  |
| 9        | null                   |
| 10       | 1997-02-15 17:32:01.0  |
| null     | 1997-02-16 17:32:01.0  |
| 12       | 1897-02-18 17:32:01.0  |
| 13       | 2002-02-14 17:32:01.0  |
| 14       | 1991-02-10 17:32:01.0  |
| 15       | 1900-02-16 17:32:01.0  |
| 16       | null                   |
| null     | 1897-02-16 17:32:01.0  |
| 18       | 1997-02-16 17:32:01.0  |
| null     | null                   |
| 20       | 1996-02-28 17:32:01.0  |
| null     | null                   |
+----------+------------------------+
21 rows selected (0.368 seconds)
0: jdbc:drill:zk=10.10.100.190:5181> alter session set `store.hive.optimize_scan_with_native_readers` = true;
+-------+--------------------------------------------------------+
| ok    | summary                                                |
+-------+--------------------------------------------------------+
| true  | store.hive.optimize_scan_with_native_readers updated.  |
+-------+--------------------------------------------------------+
1 row selected (0.213 seconds)
0: jdbc:drill:zk=10.10.100.190:5181> select int_col, timestamp_col from hive1_fewtypes_null_parquet;
+----------+------------------------+
| int_col  | timestamp_col          |
+----------+------------------------+
| 1        | null                   |
| null     | 1997-01-02 00:00:00.0  |
| 3        | 1997-02-10 17:32:00.0  |
| 4        | null                   |
| 5        | 1997-02-11 17:32:01.0  |
| 6        | 1997-02-12 17:32:01.0  |
| 7        | 1997-02-13 17:32:01.0  |
| 8        | 1997-02-15 17:32:01.0  |
| 9        | 1997-02-16 17:32:01.0  |
| 10       | 1900-02-16 17:32:01.0  |
| null     | 1897-02-16 17:32:01.0  |
| 12       | 1997-02-16 17:32:01.0  |
| 13       | 1996-02-28 17:32:01.0  |
| 14       | 1997-01-02 00:00:00.0  |
| 15       | 1997-01-02 00:00:00.0  |
| 16       | 1997-01-02 00:00:00.0  |
| null     | 1997-01-02 00:00:00.0  |
| 18       | 1997-01-02 00:00:00.0  |
| null     | 1997-01-02 00:00:00.0  |
| 20       | 1997-01-02 00:00:00.0  |
| null     | 1997-01-02 00:00:00.0  |
+----------+------------------------+
21 rows selected (0.352 seconds)
{code}

DDL for hive table :
{code}
create external table hive1_fewtypes_null_parquet (
    int_col int,
    bigint_col bigint,
    date_col string,
    time_col string,
    timestamp_col timestamp,
    interval_col string,
    varchar_col string,
    float_col float,
    double_col double,
    bool_col boolean
)
stored as parquet
location '/drill/testdata/hive_storage/hive1_fewtypes_null';
{code}

Attached the underlying parquet file



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)