[ https://issues.apache.org/jira/browse/DRILL-4373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15583114#comment-15583114 ]
ASF GitHub Bot commented on DRILL-4373:
---------------------------------------

Github user vdiravka commented on a diff in the pull request:

    https://github.com/apache/drill/pull/600#discussion_r83710133

    --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriter.java ---
    @@ -754,15 +764,45 @@ public void testImpalaParquetVarBinary_DictChange() throws Exception {
         compareParquetReadersColumnar("field_impala_ts", "cp.`parquet/int96_dict_change.parquet`");
       }

    +  @Test
    +  public void testImpalaParquetBinaryTimeStamp_DictChange() throws Exception {
    +    try {
    +      test("alter session set %s = true", ExecConstants.PARQUET_READER_INT96_AS_TIMESTAMP);
    +      compareParquetReadersColumnar("field_impala_ts", "cp.`parquet/int96_dict_change.parquet`");
    --- End diff --

    1. Is it better to compare the result with baseline columns and values from the file, or is it OK to compare with `sqlBaselineQuery` and the new `PARQUET_READER_INT96_AS_TIMESTAMP` option disabled?
    2. While investigating this test I found that the primitive data type of the column in the file `int96_dict_change.parquet` is BINARY, not INT96. I am a little confused by this. Do we need to convert this BINARY to TIMESTAMP as well? The CONVERT_FROM function with the IMPALA_TIMESTAMP argument works properly for this field. I will investigate further whether Impala and Hive can store timestamps as Parquet BINARY.


> Drill and Hive have incompatible timestamp representations in parquet
> ---------------------------------------------------------------------
>
>                 Key: DRILL-4373
>                 URL: https://issues.apache.org/jira/browse/DRILL-4373
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Storage - Hive, Storage - Parquet
>    Affects Versions: 1.8.0
>            Reporter: Rahul Challapalli
>            Assignee: Karthikeyan Manivannan
>              Labels: doc-impacting
>             Fix For: 1.9.0
>
>
> git.commit.id.abbrev=83d460c
> I created a parquet file with a timestamp type using Drill. Now if I define a
> hive table on top of the parquet file and use "timestamp" as the column type,
> drill fails to read the hive table through the hive storage plugin

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
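
For readers following the INT96 question in the review comment above, here is a minimal sketch of how a 12-byte Impala/Hive-style timestamp is conventionally decoded. It assumes the usual writer layout: 8 little-endian bytes of nanoseconds within the day, followed by 4 little-endian bytes holding the Julian day number. The class and method names (`Int96TimestampSketch`, `int96ToEpochMillis`, `encode`) are hypothetical and are not taken from Drill or from the pull request. The sketch works only on the raw bytes, so it does not depend on whether the Parquet schema declares the column as INT96 or BINARY, and it treats the value as UTC.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.time.Instant;

// Hypothetical helper, not Drill's implementation.
class Int96TimestampSketch {
    // Julian day number of 1970-01-01, the Unix epoch.
    static final long JULIAN_DAY_OF_EPOCH = 2_440_588L;
    static final long MILLIS_PER_DAY = 86_400_000L;
    static final long NANOS_PER_MILLI = 1_000_000L;

    // Decodes 12 bytes (assumed layout: nanos-of-day, then Julian day,
    // both little-endian) into milliseconds since the Unix epoch.
    // Sub-millisecond precision is truncated.
    static long int96ToEpochMillis(byte[] int96) {
        if (int96.length != 12) {
            throw new IllegalArgumentException("expected 12 bytes, got " + int96.length);
        }
        ByteBuffer buf = ByteBuffer.wrap(int96).order(ByteOrder.LITTLE_ENDIAN);
        long nanosOfDay = buf.getLong(); // bytes 0..7
        long julianDay = buf.getInt();   // bytes 8..11
        return (julianDay - JULIAN_DAY_OF_EPOCH) * MILLIS_PER_DAY
                + nanosOfDay / NANOS_PER_MILLI;
    }

    // Builds a 12-byte value in the same assumed layout; used only to
    // produce sample input for the decoder.
    static byte[] encode(int julianDay, long nanosOfDay) {
        return ByteBuffer.allocate(12)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putLong(nanosOfDay)
                .putInt(julianDay)
                .array();
    }

    public static void main(String[] args) {
        // One day after the epoch, plus one hour expressed in nanoseconds.
        byte[] sample = encode(2_440_589, 3_600_000_000_000L);
        long millis = int96ToEpochMillis(sample);
        System.out.println(millis + " ms -> " + Instant.ofEpochMilli(millis));
    }
}
```

Whether a reader should apply a time-zone adjustment to the decoded value is a separate question that this sketch leaves open.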