[ 
https://issues.apache.org/jira/browse/IMPALA-8721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17742844#comment-17742844
 ] 

ASF subversion and git services commented on IMPALA-8721:
---------------------------------------------------------

Commit fbd8664b6b4d4b5d3df4290dc2309227803e245c in impala's branch 
refs/heads/master from Michael Smith
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=fbd8664b6 ]

IMPALA-12275: Read files written with DeflateCodec

DeflateCodec is an alias to DefaultCodec. Impala works with
DefaultCodec. Fixes reading files written with DeflateCodec.

DeflateCodec isn't an issue with text files because they don't include a
codec header. Sequence files do, which we check on decompress.
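The alias handling described above can be sketched as follows. This is a minimal illustration in Python, not Impala's actual implementation: the two Hadoop class names are real, but `CODEC_ALIASES`, `SUPPORTED_CODECS`, and `resolve_codec` are hypothetical names introduced only for this example.

```python
# Hypothetical sketch of resolving a codec name read from a sequence file
# header. Only the two Hadoop class names below come from Hadoop itself;
# every other name here is an assumption made for illustration.
DEFAULT_CODEC = "org.apache.hadoop.io.compress.DefaultCodec"
DEFLATE_CODEC = "org.apache.hadoop.io.compress.DeflateCodec"

# DeflateCodec is an alias, so map it to the codec the reader already handles.
CODEC_ALIASES = {DEFLATE_CODEC: DEFAULT_CODEC}

# Illustrative set of codecs the reader knows how to decompress.
SUPPORTED_CODECS = {DEFAULT_CODEC}


def resolve_codec(header_codec_name):
    """Return the canonical codec name, or raise if it is unsupported."""
    canonical = CODEC_ALIASES.get(header_codec_name, header_codec_name)
    if canonical not in SUPPORTED_CODECS:
        raise ValueError("unsupported codec in file header: " + header_codec_name)
    return canonical
```

Text files never reach a check like this because they carry no codec header, which is consistent with the commit message's note that only sequence files were affected.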

Moves TestTextInterop to an E2E test since it doesn't require any special
startup options and refactors out test running to be format-agnostic.
Updates text file test as IMPALA-8721 is fixed. Removes creating a table
in Impala for Hive to read, as it didn't test anything new. Adds tests
for sequence files; excludes reading zstd due to IMPALA-12276.

Testing:
- manual exhaustive run of updated tests

Change-Id: Id5ec1d0345ae35597f6aade9d8b9eef2257efeba
Reviewed-on: http://gerrit.cloudera.org:8080/20181
Reviewed-by: Joe McDonnell <joemcdonn...@cloudera.com>
Tested-by: Michael Smith <michael.sm...@cloudera.com>


> Wrong result when Impala reads a Hive written parquet TimeStamp column
> ----------------------------------------------------------------------
>
>                 Key: IMPALA-8721
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8721
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>            Reporter: Abhishek Rawat
>            Assignee: Tim Armstrong
>            Priority: Critical
>              Labels: Interoperability, correctness, hive, impala, parquet, 
> timestamp
>             Fix For: Impala 4.0.0
>
>
>  
> Easy to repro on latest upstream:
> {code:java}
> hive> create table t1_hive(c1 timestamp) stored as parquet;
> hive> insert into t1_hive values('2009-03-09 01:20:03.600000000');
> hive> select * from t1_hive;
> OK
> 2009-03-09 01:20:03.6
> [localhost:21000] default> invalidate metadata t1_hive;
> [localhost:21000] default> select * from t1_hive;
> Query: select * from t1_hive
> Query submitted at: 2019-06-24 09:55:36 (Coordinator: 
> http://optimus-prime:25000)
> Query progress can be monitored at: 
> http://optimus-prime:25000/query_plan?query_id=b34f85cb5da29c26:d4dfcb2400000000
> +-------------------------------+
> | c1                            |
> +-------------------------------+
> | 2009-03-09 09:20:03.600000000 | <<<<<UTC
> +-------------------------------+
> bin/start-impala-cluster.py 
> --impalad_args='-convert_legacy_hive_parquet_utc_timestamps=true'
> [localhost:21000] default> select * from t1_hive;
> Query: select * from t1_hive
> Query submitted at: 2019-06-24 10:00:22 (Coordinator: 
> http://optimus-prime:25000)
> Query progress can be monitored at: 
> http://optimus-prime:25000/query_plan?query_id=d5428bb21fb259b9:7b10703400000000
> +-------------------------------+
> | c1                            |
> +-------------------------------+
> | 2009-03-09 02:20:03.600000000 | <<<<<<PST8PDT
> +-------------------------------+
>  
> {code}
>  
> This issue is causing testcase test_hive_impala_interop to fail. Until this 
> issue is fixed, the testcase will be updated to not include a timestamp 
> column. The test case should be updated to include a timestamp column once 
> this issue is fixed.
>  
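The three timestamps in the repro above differ by whole hours, and the arithmetic can be checked with a short sketch. It is offered only as a reading of the numbers, not as a confirmed root cause. It assumes that the value Impala returns without conversion is the stored UTC value (as the report labels it), that PST8PDT standard time is UTC-8, and that daylight time (UTC-7) was in effect on 2009-03-09 because US DST began on 2009-03-08. The helper name `to_local` is hypothetical.

```python
from datetime import datetime, timedelta

# Value Impala returns without conversion, labelled UTC in the report.
stored_utc = datetime(2009, 3, 9, 9, 20, 3, 600000)

# Assumed fixed offsets for PST8PDT (see the lead-in above).
PST_OFFSET = timedelta(hours=-8)  # standard time
PDT_OFFSET = timedelta(hours=-7)  # daylight time, in effect on 2009-03-09


def to_local(utc_value, offset):
    """Shift a naive UTC datetime by a fixed UTC offset."""
    return utc_value + offset


# Matches what Hive displays: 2009-03-09 01:20:03.6
hive_view = to_local(stored_utc, PST_OFFSET)

# Matches what Impala displays with
# -convert_legacy_hive_parquet_utc_timestamps=true: 2009-03-09 02:20:03.6
impala_converted_view = to_local(stored_utc, PDT_OFFSET)
```

Under these assumptions the one-hour gap between the Hive and Impala results equals the difference between the standard and daylight offsets on that date.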



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
