[jira] [Commented] (IMPALA-8721) Wrong result when Impala reads a Hive written parquet TimeStamp column
[ https://issues.apache.org/jira/browse/IMPALA-8721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17742844#comment-17742844 ]

ASF subversion and git services commented on IMPALA-8721:
---------------------------------------------------------

Commit fbd8664b6b4d4b5d3df4290dc2309227803e245c in impala's branch refs/heads/master from Michael Smith
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=fbd8664b6 ]

IMPALA-12275: Read files written with DeflateCodec

DeflateCodec is an alias to DefaultCodec. Impala works with DefaultCodec.
Fixes reading files written with DeflateCodec.

DeflateCodec isn't an issue with text files because they don't include a
codec header. Sequence files do, which we check on decompress.

Moves TestTextInterop to an E2E test since it doesn't require any special
startup options and refactors out test running to be format-agnostic.

Updates text file test as IMPALA-8721 is fixed. Removes creating a table
in Impala for Hive to read, as it didn't test anything new.

Adds tests for sequence files; excludes reading zstd due to IMPALA-12276.

Testing:
- manual exhaustive run of updated tests

Change-Id: Id5ec1d0345ae35597f6aade9d8b9eef2257efeba
Reviewed-on: http://gerrit.cloudera.org:8080/20181
Reviewed-by: Joe McDonnell
Tested-by: Michael Smith

> Wrong result when Impala reads a Hive written parquet TimeStamp column
> ----------------------------------------------------------------------
>
>                 Key: IMPALA-8721
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8721
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>            Reporter: Abhishek Rawat
>            Assignee: Tim Armstrong
>            Priority: Critical
>              Labels: Interoperability, correctness, hive, impala, parquet, timestamp
>             Fix For: Impala 4.0.0
>
> Easy to repro on latest upstream:
> {code:java}
> hive> create table t1_hive(c1 timestamp) stored as parquet;
> hive> insert into t1_hive values('2009-03-09 01:20:03.6');
> hive> select * from t1_hive;
> OK
> 2009-03-09 01:20:03.6
> [localhost:21000] default> invalidate metadata t1_hive;
> [localhost:21000] default> select * from t1_hive;
> Query: select * from t1_hive
> Query submitted at: 2019-06-24 09:55:36 (Coordinator: http://optimus-prime:25000)
> Query progress can be monitored at: http://optimus-prime:25000/query_plan?query_id=b34f85cb5da29c26:d4dfcb24
> +-----------------------+
> | c1                    |
> +-----------------------+
> | 2009-03-09 09:20:03.6 |
> +-----------------------+
> bin/start-impala-cluster.py --impalad_args='-convert_legacy_hive_parquet_utc_timestamps=true'
> [localhost:21000] default> select * from t1_hive;
> Query: select * from t1_hive
> Query submitted at: 2019-06-24 10:00:22 (Coordinator: http://optimus-prime:25000)
> Query progress can be monitored at: http://optimus-prime:25000/query_plan?query_id=d5428bb21fb259b9:7b107034
> +-----------------------+
> | c1                    |
> +-----------------------+
> | 2009-03-09 02:20:03.6 |. <
> +-----------------------+
> {code}
>
> This issue is causing testcase test_hive_impala_interop to fail. Until this
> issue is fixed, the testcase will be updated to not include a timestamp
> column. The test case should be updated to include a timestamp column once
> this issue is fixed.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org

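The alias handling described in the IMPALA-12275 commit message above (DeflateCodec treated as DefaultCodec when a sequence file header names its codec) can be illustrated with a small sketch. This is not Impala's implementation. The function name `resolve_codec` and both tables are hypothetical, and only the DeflateCodec to DefaultCodec pairing comes from the commit message. The fully qualified names are the Hadoop classes of those names.

```python
# Illustrative sketch only, not Impala code.
# The alias pair below is taken from the commit message; everything else
# (table names, function name, error handling) is an assumption.

DEFAULT_CODEC = "org.apache.hadoop.io.compress.DefaultCodec"
DEFLATE_CODEC = "org.apache.hadoop.io.compress.DeflateCodec"

# Hypothetical alias table: header codec name -> canonical codec name.
CODEC_ALIASES = {DEFLATE_CODEC: DEFAULT_CODEC}

# Hypothetical set of codecs the reader knows how to decompress.
SUPPORTED_CODECS = {DEFAULT_CODEC}


def resolve_codec(header_codec: str) -> str:
    """Map the codec class name found in a file header to a supported codec.

    Raises ValueError if the name is neither supported nor an alias of a
    supported codec.
    """
    canonical = CODEC_ALIASES.get(header_codec, header_codec)
    if canonical not in SUPPORTED_CODECS:
        raise ValueError(f"unsupported codec in header: {header_codec}")
    return canonical


if __name__ == "__main__":
    # A header naming DeflateCodec resolves to the DefaultCodec path.
    print(resolve_codec(DEFLATE_CODEC))
```

Text files carry no codec header, so a check of this kind would never run for them, which matches the commit message's remark that DeflateCodec was only a problem for sequence files.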
[jira] [Commented] (IMPALA-8721) Wrong result when Impala reads a Hive written parquet TimeStamp column
[ https://issues.apache.org/jira/browse/IMPALA-8721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17282137#comment-17282137 ]

ASF subversion and git services commented on IMPALA-8721:
---------------------------------------------------------

Commit 1f7b413d11321bd74aaa1a9ea9ed30e4d80d in impala's branch refs/heads/master from Tim Armstrong
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=1f7b413 ]

IMPALA-8721: re-enable test_hive_impala_interop

The test now passes because HIVE-21290 was fixed.

Revert "IMPALA-8689: test_hive_impala_interop failing with "Timeout >7200s""

This reverts commit 5d8c99ce74c45a7d04f11e1f252b346d654f02bf.

Change-Id: I7e2beabd7082a45a0fc3b60d318cf698079768ff
Reviewed-on: http://gerrit.cloudera.org:8080/17042
Reviewed-by: Impala Public Jenkins
Tested-by: Impala Public Jenkins

> Wrong result when Impala reads a Hive written parquet TimeStamp column
> ----------------------------------------------------------------------
>
>                 Key: IMPALA-8721
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8721
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>            Reporter: Abhishek Rawat
>            Assignee: Tim Armstrong
>            Priority: Critical
>              Labels: Interoperability, correctness, hive, impala, parquet, timestamp
>
> Easy to repro on latest upstream:
> {code:java}
> hive> create table t1_hive(c1 timestamp) stored as parquet;
> hive> insert into t1_hive values('2009-03-09 01:20:03.6');
> hive> select * from t1_hive;
> OK
> 2009-03-09 01:20:03.6
> [localhost:21000] default> invalidate metadata t1_hive;
> [localhost:21000] default> select * from t1_hive;
> Query: select * from t1_hive
> Query submitted at: 2019-06-24 09:55:36 (Coordinator: http://optimus-prime:25000)
> Query progress can be monitored at: http://optimus-prime:25000/query_plan?query_id=b34f85cb5da29c26:d4dfcb24
> +-----------------------+
> | c1                    |
> +-----------------------+
> | 2009-03-09 09:20:03.6 |
> +-----------------------+
> bin/start-impala-cluster.py --impalad_args='-convert_legacy_hive_parquet_utc_timestamps=true'
> [localhost:21000] default> select * from t1_hive;
> Query: select * from t1_hive
> Query submitted at: 2019-06-24 10:00:22 (Coordinator: http://optimus-prime:25000)
> Query progress can be monitored at: http://optimus-prime:25000/query_plan?query_id=d5428bb21fb259b9:7b107034
> +-----------------------+
> | c1                    |
> +-----------------------+
> | 2009-03-09 02:20:03.6 |. <
> +-----------------------+
> {code}
>
> This issue is causing testcase test_hive_impala_interop to fail. Until this
> issue is fixed, the testcase will be updated to not include a timestamp
> column. The test case should be updated to include a timestamp column once
> this issue is fixed.

[jira] [Commented] (IMPALA-8721) Wrong result when Impala reads a Hive written parquet TimeStamp column
[ https://issues.apache.org/jira/browse/IMPALA-8721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17281518#comment-17281518 ]

Tim Armstrong commented on IMPALA-8721:
---------------------------------------

I think this was fixed by HIVE-21290 - the test passes now if I revert IMPALA-8689

> Wrong result when Impala reads a Hive written parquet TimeStamp column
> ----------------------------------------------------------------------
>
>                 Key: IMPALA-8721
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8721
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>            Reporter: Abhishek Rawat
>            Assignee: Tim Armstrong
>            Priority: Critical
>              Labels: Interoperability, correctness, hive, impala, parquet, timestamp
>
> Easy to repro on latest upstream:
> {code:java}
> hive> create table t1_hive(c1 timestamp) stored as parquet;
> hive> insert into t1_hive values('2009-03-09 01:20:03.6');
> hive> select * from t1_hive;
> OK
> 2009-03-09 01:20:03.6
> [localhost:21000] default> invalidate metadata t1_hive;
> [localhost:21000] default> select * from t1_hive;
> Query: select * from t1_hive
> Query submitted at: 2019-06-24 09:55:36 (Coordinator: http://optimus-prime:25000)
> Query progress can be monitored at: http://optimus-prime:25000/query_plan?query_id=b34f85cb5da29c26:d4dfcb24
> +-----------------------+
> | c1                    |
> +-----------------------+
> | 2009-03-09 09:20:03.6 |
> +-----------------------+
> bin/start-impala-cluster.py --impalad_args='-convert_legacy_hive_parquet_utc_timestamps=true'
> [localhost:21000] default> select * from t1_hive;
> Query: select * from t1_hive
> Query submitted at: 2019-06-24 10:00:22 (Coordinator: http://optimus-prime:25000)
> Query progress can be monitored at: http://optimus-prime:25000/query_plan?query_id=d5428bb21fb259b9:7b107034
> +-----------------------+
> | c1                    |
> +-----------------------+
> | 2009-03-09 02:20:03.6 |. <
> +-----------------------+
> {code}
>
> This issue is causing testcase test_hive_impala_interop to fail. Until this
> issue is fixed, the testcase will be updated to not include a timestamp
> column. The test case should be updated to include a timestamp column once
> this issue is fixed.

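The three timestamps in the repro above differ by whole hours, and a short calculation makes the size of each discrepancy explicit. The sketch below only does arithmetic on the values quoted in the report. The variable and function names are hypothetical, and the report does not state the cluster's time zone, so any reading of the 8-hour and 7-hour gaps as a daylight-saving disagreement between writer and reader is an assumption, not something the thread confirms.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Values copied from the repro in the issue description.
written_by_hive = datetime.strptime("2009-03-09 01:20:03.6", FMT)
read_without_flag = datetime.strptime("2009-03-09 09:20:03.6", FMT)
read_with_flag = datetime.strptime("2009-03-09 02:20:03.6", FMT)


def hours_between(later: datetime, earlier: datetime) -> float:
    """Return the difference between two naive timestamps in hours."""
    return (later - earlier) / timedelta(hours=1)


# Shift between what Hive wrote and what Impala shows with no conversion.
raw_shift = hours_between(read_without_flag, written_by_hive)

# Shift that -convert_legacy_hive_parquet_utc_timestamps=true removed.
converted_shift = hours_between(read_without_flag, read_with_flag)

# Error that remains even with the conversion flag enabled.
residual_error = hours_between(read_with_flag, written_by_hive)

if __name__ == "__main__":
    print(raw_shift, converted_shift, residual_error)
```

The unconverted value is 8 hours ahead of what Hive wrote, the flag moves it back by 7 hours, and the result is still 1 hour off, which is the wrong result the issue title refers to.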
[jira] [Commented] (IMPALA-8721) Wrong result when Impala reads a Hive written parquet TimeStamp column
[ https://issues.apache.org/jira/browse/IMPALA-8721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16876469#comment-16876469 ]

ASF subversion and git services commented on IMPALA-8721:
---------------------------------------------------------

Commit 5d8c99ce74c45a7d04f11e1f252b346d654f02bf in impala's branch refs/heads/master from Abhishek Rawat
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=5d8c99c ]

IMPALA-8689: test_hive_impala_interop failing with "Timeout >7200s"

The newly added Hive<->Impala interop test fails due to unexpected wrong
results when reading TimeStamp column value written by Hive. The short
term measure is to remove TimeStamp column from the interop tests. The
original issue will be fixed by IMPALA-8721.

Testing:
Ran the testcase N number of times on both upstream and downstream code
base.

Change-Id: I148c79a31f9aada1b75614390434462d1e483f28
Reviewed-on: http://gerrit.cloudera.org:8080/13755
Reviewed-by: Impala Public Jenkins
Tested-by: Impala Public Jenkins

> Wrong result when Impala reads a Hive written parquet TimeStamp column
> ----------------------------------------------------------------------
>
>                 Key: IMPALA-8721
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8721
>             Project: IMPALA
>          Issue Type: Bug
>            Reporter: Abhishek Rawat
>            Priority: Major
>              Labels: Interoperability, hive, impala, parquet, timestamp
>             Fix For: Impala 3.3.0
>
> Easy to repro on latest upstream:
> {code:java}
> hive> create table t1_hive(c1 timestamp) stored as parquet;
> hive> insert into t1_hive values('2009-03-09 01:20:03.6');
> hive> select * from t1_hive;
> OK
> 2009-03-09 01:20:03.6
> [localhost:21000] default> invalidate metadata t1_hive;
> [localhost:21000] default> select * from t1_hive;
> Query: select * from t1_hive
> Query submitted at: 2019-06-24 09:55:36 (Coordinator: http://optimus-prime:25000)
> Query progress can be monitored at: http://optimus-prime:25000/query_plan?query_id=b34f85cb5da29c26:d4dfcb24
> +-----------------------+
> | c1                    |
> +-----------------------+
> | 2009-03-09 09:20:03.6 |
> +-----------------------+
> bin/start-impala-cluster.py --impalad_args='-convert_legacy_hive_parquet_utc_timestamps=true'
> [localhost:21000] default> select * from t1_hive;
> Query: select * from t1_hive
> Query submitted at: 2019-06-24 10:00:22 (Coordinator: http://optimus-prime:25000)
> Query progress can be monitored at: http://optimus-prime:25000/query_plan?query_id=d5428bb21fb259b9:7b107034
> +-----------------------+
> | c1                    |
> +-----------------------+
> | 2009-03-09 02:20:03.6 |. <
> +-----------------------+
> {code}
>
> This issue is causing testcase test_hive_impala_interop to fail. Until this
> issue is fixed, the testcase will be updated to not include a timestamp
> column. The test case should be updated to include a timestamp column once
> this issue is fixed.