[jira] [Created] (IMPALA-9116) SASL server fails when FQDN is greater than 63 characters long in Kudu RPC
Anurag Mantripragada created IMPALA-9116: Summary: SASL server fails when FQDN is greater than 63 characters long in Kudu RPC Key: IMPALA-9116 URL: https://issues.apache.org/jira/browse/IMPALA-9116 Project: IMPALA Issue Type: Bug Components: Backend Affects Versions: Impala 3.3.0 Reporter: Anurag Mantripragada Fix For: Impala 3.4.0 In the current Kudu RPC implementation, we don't explicitly pass the host's FQDN into the SASL library. Due to an upstream SASL bug ([https://github.com/cyrusimap/cyrus-sasl/issues/583]), the FQDN gets truncated when the library tries to determine the server's principal and the server's FQDN is longer than 63 characters. This results in startup failures where the preflight checks fail because the appropriate keytab entry cannot be found (after searching for a truncated host name). To work around this, we should use our own code to compute the FQDN. Kudu is making this change in its own implementation here: https://issues.apache.org/jira/browse/KUDU-2989; we should do the same. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-3933) Time zone definitions of Hive/Spark and Impala differ for historical dates
[ https://issues.apache.org/jira/browse/IMPALA-3933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16964611#comment-16964611 ] Manish Maheshwari commented on IMPALA-3933: --- So can we do something about this? I know not too many users will be using such old dates, but some do, and this becomes one of the reasons to use Hive over Impala for BI queries. > Time zone definitions of Hive/Spark and Impala differ for historical dates > -- > > Key: IMPALA-3933 > URL: https://issues.apache.org/jira/browse/IMPALA-3933 > Project: IMPALA > Issue Type: New Feature > Components: Backend >Affects Versions: impala 2.3 >Reporter: Adriano Simone >Priority: Minor > > How TIMESTAMP values skew with convert_legacy_hive_parquet_utc_timestamps=true > Enabling --convert_legacy_hive_parquet_utc_timestamps=true seems to cause > data skew (improper conversion) when reading dates earlier than 1900 > (not sure about the exact date). > The following example was run on a server in the CEST timezone, so the > time difference is GMT+1 for dates before 1900 (I'm not sure, I haven't > checked the exact starting date of DST computation), and GMT+2 when summer > daylight saving time was applied. 
> create table itst (col1 int, myts timestamp) stored as parquet; > From impala: > {code:java} > insert into itst values (1,'2016-04-15 12:34:45'); > insert into itst values (2,'1949-04-15 12:34:45'); > insert into itst values (3,'1753-04-15 12:34:45'); > insert into itst values (4,'1752-04-15 12:34:45'); > {code} > from hive > {code:java} > insert into itst values (5,'2016-04-15 12:34:45'); > insert into itst values (6,'1949-04-15 12:34:45'); > insert into itst values (7,'1753-04-15 12:34:45'); > insert into itst values (8,'1752-04-15 12:34:45'); > {code} > From impala > {code:java} > select * from itst order by col1; > {code} > Result: > {code:java} > Query: select * from itst > +--+-+ > | col1 | myts| > +--+-+ > | 1| 2016-04-15 12:34:45 | > | 2| 1949-04-15 12:34:45 | > | 3| 1753-04-15 12:34:45 | > | 4| 1752-04-15 12:34:45 | > | 5| 2016-04-15 10:34:45 | > | 6| 1949-04-15 10:34:45 | > | 7| 1753-04-15 11:34:45 | > | 8| 1752-04-15 11:34:45 | > +--+-+ > {code} > The timestamps are looking good, the DST differences can be seen (hive > inserted it in local time, but impala shows it in UTC) > From impala after setting the command line argument > "--convert_legacy_hive_parquet_utc_timestamps=true" > {code:java} > select * from itst order by col1; > {code} > The result in this case: > {code:java} > Query: select * from itst order by col1 > +--+-+ > | col1 | myts| > +--+-+ > | 1| 2016-04-15 12:34:45 | > | 2| 1949-04-15 12:34:45 | > | 3| 1753-04-15 12:34:45 | > | 4| 1752-04-15 12:34:45 | > | 5| 2016-04-15 12:34:45 | > | 6| 1949-04-15 12:34:45 | > | 7| 1753-04-15 12:51:05 | > | 8| 1752-04-15 12:51:05 | > +--+-+ > {code} > It seems that instead of 11:34:45 it is showing 12:51:05. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
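The odd-looking seconds-level skew (12:51:05 instead of 11:34:45) is characteristic of how timezone databases handle dates before standard time was adopted: such timestamps resolve to Local Mean Time (LMT), whose offset is generally not a whole number of minutes. A small Python sketch shows the effect; Europe/Rome is chosen purely as an illustrative CEST-area zone, since the reporter's actual server zone isn't stated.

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo  # Python 3.9+, requires tzdata on the host

tz = ZoneInfo("Europe/Rome")  # any CEST-area zone shows the same pattern

modern = datetime(2016, 4, 15, 12, 34, 45, tzinfo=tz)
historic = datetime(1753, 4, 15, 12, 34, 45, tzinfo=tz)

print(modern.utcoffset())    # 2:00:00 (CEST, summer time)
print(historic.utcoffset())  # Rome's Local Mean Time, not a whole hour
```

So when --convert_legacy_hive_parquet_utc_timestamps=true converts a pre-standard-time UTC value to local time, the LMT offset produces a seconds-level shift rather than the whole-hour difference one might expect.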
[jira] [Commented] (IMPALA-9032) Impala returns 0 rows over hs2-http without waiting for fetch_rows_timeout_ms timeout
[ https://issues.apache.org/jira/browse/IMPALA-9032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16964478#comment-16964478 ] Sahil Takiar commented on IMPALA-9032: -- The bug is here: [https://github.com/apache/impala/commit/151835116a7972b15a646f8eae6bd8a593bb3564#diff-56ca691d07bb5e79ea7a99aa180cbf91R136-R137] - {{PlanRootSink::fetch_rows_timeout_us()}} is in microseconds, but {{wait_timeout_timer.ElapsedTime()}} is in nanoseconds. This was fixed here: [https://github.com/apache/impala/commit/c47fca5960b5be1a8e2013c4c4ffe260e98a1bff#diff-56ca691d07bb5e79ea7a99aa180cbf91R143-R145] > Impala returns 0 rows over hs2-http without waiting for fetch_rows_timeout_ms > timeout > - > > Key: IMPALA-9032 > URL: https://issues.apache.org/jira/browse/IMPALA-9032 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 3.4.0 >Reporter: Lars Volker >Priority: Major > > This looks like a bug to me but I'm not entirely sure. I'm trying to run our > tests over hs2-http (IMPALA-8863) and after the change for IMPALA-7312 to > introduce a non-blocking mode for FetchResults() it looks like we sometimes > return an empty result way before {{fetch_rows_timeout_ms}} has elapsed. This > triggers a bug in Impyla > ([#369|https://github.com/cloudera/impyla/issues/369]), but it also seems > like something we should investigate and fix in Impala. 
> {noformat} > I1007 22:10:10.697760 56550 impala-hs2-server.cc:821] FetchResults(): > query_id=764d4313dbc64e20:2831560c fetch_size=1024I1007 > 22:10:10.697760 56550 impala-hs2-server.cc:821] FetchResults(): > query_id=764d4313dbc64e20:2831560c fetch_size=1024I1007 > 22:10:10.697988 56527 scheduler.cc:468] 6d4cba4d2e8ccc42:66ce26a8] > Exec at coord is falseI1007 22:10:10.698014 54090 impala-hs2-server.cc:663] > GetOperationStatus(): query_id=0d43fd73ce4403fd:da25dde9I1007 > 22:10:10.698173 127 control-service.cc:142] > 0646e91fd6a0a953:02949ff3] ExecQueryFInstances(): > query_id=0646e91fd6a0a953:02949ff3 coord=b04a12d76e27:22000 > #instances=1I1007 22:10:10.698356 56527 admission-controller.cc:1270] > 6d4cba4d2e8ccc42:66ce26a8] Trying to admit > id=6d4cba4d2e8ccc42:66ce26a8 in pool_name=root.default > executor_group_name=default per_host_mem_estimate=52.02 MB > dedicated_coord_mem_estimate=110.02 MB max_requests=-1 (configured > statically) max_queued=200 (configured statically) max_mem=29.30 GB > (configured statically)I1007 22:10:10.698386 56527 > admission-controller.cc:1282] 6d4cba4d2e8ccc42:66ce26a8] Stats: > agg_num_running=9, agg_num_queued=0, agg_mem_reserved=8.34 GB, > local_host(local_mem_admitted=9.09 GB, num_admitted_running=9, num_queued=0, > backend_mem_reserved=6.70 GB)I1007 22:10:10.698415 56527 > admission-controller.cc:871] 6d4cba4d2e8ccc42:66ce26a8] Admitting > query id=6d4cba4d2e8ccc42:66ce26a8I1007 22:10:10.698479 56527 > impala-server.cc:1713] 6d4cba4d2e8ccc42:66ce26a8] Registering query > locationsI1007 22:10:10.698529 56527 coordinator.cc:97] > 6d4cba4d2e8ccc42:66ce26a8] Exec() > query_id=6d4cba4d2e8ccc42:66ce26a8 stmt=select count(*) from alltypes > where month=1I1007 22:10:10.698992 56527 coordinator.cc:361] > 6d4cba4d2e8ccc42:66ce26a8] starting execution on 3 backends for > query_id=6d4cba4d2e8ccc42:66ce26a8I1007 22:10:10.699383 56523 > coordinator.cc:375] 0646e91fd6a0a953:02949ff3] started execution on 1 > backends for 
query_id=0646e91fd6a0a953:02949ff3I1007 22:10:10.699409 > 56534 scheduler.cc:468] e1495f928c2cd4f6:eeda82aa] Exec at coord is > falseI1007 22:10:10.700017 127 control-service.cc:142] > 6d4cba4d2e8ccc42:66ce26a8] ExecQueryFInstances(): > query_id=6d4cba4d2e8ccc42:66ce26a8 coord=b04a12d76e27:22000 > #instances=1I1007 22:10:10.700147 56534 scheduler.cc:468] > e1495f928c2cd4f6:eeda82aa] Exec at coord is falseI1007 > 22:10:10.700234 325 TAcceptQueueServer.cpp:340] New connection to server > hiveserver2-http-frontend from client I1007 > 22:10:10.700286 329 TAcceptQueueServer.cpp:227] TAcceptQueueServer: > hiveserver2-http-frontend started connection setup for client 172.18.0.1 Port: 51580>I1007 22:10:10.700314 329 > TAcceptQueueServer.cpp:245] TAcceptQueueServer: hiveserver2-http-frontend > finished connection setup for client I1007 > 22:10:10.700371 56550 impala-hs2-server.cc:844] FetchResults(): > query_id=764d4313dbc64e20:2831560c #results=1 has_more=trueI1007 > 22:10:10.700508 56551 impala-server.cc:1969] Connection > 8249c7defcb10124:1bc65ed9ea562aab from client 172.18.0.1:51576 to server > hiveserver2-http-frontend closed. The
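The unit mismatch Sahil Takiar points to above can be sketched in a few lines. This is a hypothetical Python rendering of the before/after comparison (the real code is C++ in PlanRootSink; names and values here are paraphrased for illustration).

```python
# fetch_rows_timeout_us() is in microseconds; ElapsedTime() is in nanoseconds.
fetch_rows_timeout_us = 10_000_000   # a 10 s timeout, expressed in microseconds
elapsed_ns = 50_000_000              # only 50 ms have actually elapsed

# Buggy comparison: nanoseconds compared directly against microseconds, so
# the timeout appears to expire ~1000x too early and 0 rows are returned.
buggy_timed_out = elapsed_ns >= fetch_rows_timeout_us

# Fixed comparison: convert the timeout to nanoseconds first.
NANOS_PER_MICRO = 1000
fixed_timed_out = elapsed_ns >= fetch_rows_timeout_us * NANOS_PER_MICRO

print(buggy_timed_out, fixed_timed_out)  # True False
```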
[jira] [Resolved] (IMPALA-8959) test_union failed with wrong results on S3
[ https://issues.apache.org/jira/browse/IMPALA-8959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong resolved IMPALA-8959. --- Resolution: Duplicate I'm pretty sure that this has the same cause - I checked and the query was run via HS2, so we're exposed to the same Impyla bug. > test_union failed with wrong results on S3 > -- > > Key: IMPALA-8959 > URL: https://issues.apache.org/jira/browse/IMPALA-8959 > Project: IMPALA > Issue Type: Bug >Affects Versions: Impala 3.4.0 >Reporter: Zoltán Borók-Nagy >Assignee: Tim Armstrong >Priority: Blocker > Labels: broken-build, flaky > > Error details > {noformat} > query_test/test_queries.py:77: in test_union > self.run_test_case('QueryTest/union', vector) > common/impala_test_suite.py:611: in run_test_case > self.__verify_results_and_errors(vector, test_section, result, use_db) > common/impala_test_suite.py:448: in __verify_results_and_errors > replace_filenames_with_placeholder) common/test_result_verifier.py:456: in > verify_raw_results VERIFIER_MAP[verifier](expected, actual) > common/test_result_verifier.py:278: in verify_query_result_is_equal > assert expected_results == actual_results E assert Comparing > QueryTestResults (expected vs actual): E > 0,true,0,0,0,0,0,0,'01/01/09','0',2009-01-01 00:00:00,2009,1 != None E > 0,true,0,0,0,0,0,0,'01/01/09','0',2009-01-01 00:00:00,2009,1 != None E > 1,false,1,1,1,10,1.10023841858,10.1,'01/01/09','1',2009-01-01 > 00:01:00,2009,1 != None E > 1,false,1,1,1,10,1.10023841858,10.1,'01/01/09','1',2009-01-01 > 00:01:00,2009,1 != None E 2,true,0,0,0,0,0,0,'02/01/09','0',2009-02-01 > 00:00:00,2009,2 != None E 2,true,0,0,0,0,0,0,'02/01/09','0',2009-02-01 > 00:00:00,2009,2 != None E > 3,false,1,1,1,10,1.10023841858,10.1,'02/01/09','1',2009-02-01 > 00:01:00,2009,2 != None E > 3,false,1,1,1,10,1.10023841858,10.1,'02/01/09','1',2009-02-01 > 00:01:00,2009,2 != None E 4,true,0,0,0,0,0,0,'03/01/09','0',2009-03-01 > 00:00:00,2009,3 != None E > 
5,false,1,1,1,10,1.10023841858,10.1,'03/01/09','1',2009-03-01 > 00:01:00,2009,3 != None E Number of rows returned (expected vs actual): > 10 != 0{noformat} > Stack trace > {noformat} > query_test/test_queries.py:77: in test_union > self.run_test_case('QueryTest/union', vector) > common/impala_test_suite.py:611: in run_test_case > self.__verify_results_and_errors(vector, test_section, result, use_db) > common/impala_test_suite.py:448: in __verify_results_and_errors > replace_filenames_with_placeholder) > common/test_result_verifier.py:456: in verify_raw_results > VERIFIER_MAP[verifier](expected, actual) > common/test_result_verifier.py:278: in verify_query_result_is_equal > assert expected_results == actual_results > E assert Comparing QueryTestResults (expected vs actual): > E 0,true,0,0,0,0,0,0,'01/01/09','0',2009-01-01 00:00:00,2009,1 != None > E 0,true,0,0,0,0,0,0,'01/01/09','0',2009-01-01 00:00:00,2009,1 != None > E 1,false,1,1,1,10,1.10023841858,10.1,'01/01/09','1',2009-01-01 > 00:01:00,2009,1 != None > E 1,false,1,1,1,10,1.10023841858,10.1,'01/01/09','1',2009-01-01 > 00:01:00,2009,1 != None > E 2,true,0,0,0,0,0,0,'02/01/09','0',2009-02-01 00:00:00,2009,2 != None > E 2,true,0,0,0,0,0,0,'02/01/09','0',2009-02-01 00:00:00,2009,2 != None > E 3,false,1,1,1,10,1.10023841858,10.1,'02/01/09','1',2009-02-01 > 00:01:00,2009,2 != None > E 3,false,1,1,1,10,1.10023841858,10.1,'02/01/09','1',2009-02-01 > 00:01:00,2009,2 != None > E 4,true,0,0,0,0,0,0,'03/01/09','0',2009-03-01 00:00:00,2009,3 != None > E 5,false,1,1,1,10,1.10023841858,10.1,'03/01/09','1',2009-03-01 > 00:01:00,2009,3 != None > E Number of rows returned (expected vs actual): 10 != 0{noformat} > {noformat} > select id, bool_col, tinyint_col, smallint_col, int_col, bigint_col, > float_col, double_col, date_string_col, string_col, timestamp_col, year, > month from alltypestiny where year=2009 and month=1 > union all > (select id, bool_col, tinyint_col, smallint_col, int_col, bigint_col, > float_col, double_col, 
date_string_col, string_col, timestamp_col, year, > month from alltypestiny where year=2009 and month=1 >union all > (select id, bool_col, tinyint_col, smallint_col, int_col, bigint_col, > float_col, double_col, date_string_col, string_col, timestamp_col, year, > month from alltypestiny where year=2009 and month=2 > union all > (select id, bool_col, tinyint_col, smallint_col, int_col, bigint_col, > float_col, double_col, date_string_col, string_col, timestamp_col, year, > month from alltypestiny where year=2009 and month=2 > union all > select id, bool_col, tinyint_col,
[jira] [Updated] (IMPALA-9098) TestQueries.test_union failed
[ https://issues.apache.org/jira/browse/IMPALA-9098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong updated IMPALA-9098: -- Priority: Blocker (was: Critical) > TestQueries.test_union failed > - > > Key: IMPALA-9098 > URL: https://issues.apache.org/jira/browse/IMPALA-9098 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 3.4.0 >Reporter: Andrew Sherman >Assignee: Tim Armstrong >Priority: Blocker > Labels: broken-build, flaky > Attachments: profile_correct.txt, profile_incorrect.txt > > > This happened once in an ASAN build. This *might* be a flaky test, like > IMPALA-8959 or it just might be a regression caused by IMPALA-8999 > {code} > Error Message > query_test/test_queries.py:77: in test_union > self.run_test_case('QueryTest/union', vector) > common/impala_test_suite.py:650: in run_test_case > self.__verify_results_and_errors(vector, test_section, result, use_db) > common/impala_test_suite.py:487: in __verify_results_and_errors > replace_filenames_with_placeholder) common/test_result_verifier.py:456: in > verify_raw_results VERIFIER_MAP[verifier](expected, actual) > common/test_result_verifier.py:278: in verify_query_result_is_equal > assert expected_results == actual_results E assert Comparing > QueryTestResults (expected vs actual): E 0,0 != 1,10 E 0,0 != 2,0 E > 0,0 != 224,80 E 1,10 != 252,90 E 1,10 != 3,10 E 1000,2000 != > 4,0 E 112,40 != 4,0 E 140,50 != 5,10 E 168,60 != 6,0 E 196,70 > != 7,10 E 2,0 != None E 2,0 != None E 224,80 != None E 252,90 > != None E 28,10 != None E 3,10 != None E 3,10 != None E 4,0 > != None E 4,0 != None E 5,10 != None E 5,10 != None E 56,20 > != None E 6,0 != None E 6,0 != None E 7,10 != None E 7,10 != > None E 84,30 != None E Number of rows returned (expected vs actual): > 27 != 10 > Stacktrace > query_test/test_queries.py:77: in test_union > self.run_test_case('QueryTest/union', vector) > common/impala_test_suite.py:650: in run_test_case > 
self.__verify_results_and_errors(vector, test_section, result, use_db) > common/impala_test_suite.py:487: in __verify_results_and_errors > replace_filenames_with_placeholder) > common/test_result_verifier.py:456: in verify_raw_results > VERIFIER_MAP[verifier](expected, actual) > common/test_result_verifier.py:278: in verify_query_result_is_equal > assert expected_results == actual_results > E assert Comparing QueryTestResults (expected vs actual): > E 0,0 != 1,10 > E 0,0 != 2,0 > E 0,0 != 224,80 > E 1,10 != 252,90 > E 1,10 != 3,10 > E 1000,2000 != 4,0 > E 112,40 != 4,0 > E 140,50 != 5,10 > E 168,60 != 6,0 > E 196,70 != 7,10 > E 2,0 != None > E 2,0 != None > E 224,80 != None > E 252,90 != None > E 28,10 != None > E 3,10 != None > E 3,10 != None > E 4,0 != None > E 4,0 != None > E 5,10 != None > E 5,10 != None > E 56,20 != None > E 6,0 != None > E 6,0 != None > E 7,10 != None > E 7,10 != None > E 84,30 != None > E Number of rows returned (expected vs actual): 27 != 10 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-9098) TestQueries.test_union failed
[ https://issues.apache.org/jira/browse/IMPALA-9098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16964471#comment-16964471 ] Tim Armstrong commented on IMPALA-9098: --- I found a good and bad instance of the same query. It looks like it produced at least 16 rows, but only 10 rows were returned by the client. Suspiciously, on this run BATCH_SIZE=10. I have a theory that this is actually an Impyla bug triggered by us returning 0 rows in some cases: https://github.com/cloudera/impyla/issues/369 > TestQueries.test_union failed > - > > Key: IMPALA-9098 > URL: https://issues.apache.org/jira/browse/IMPALA-9098 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 3.4.0 >Reporter: Andrew Sherman >Assignee: Tim Armstrong >Priority: Critical > Labels: broken-build, flaky > Attachments: profile_correct.txt, profile_incorrect.txt > > > This happened once in an ASAN build. This *might* be a flaky test, like > IMPALA-8959 or it just might be a regression caused by IMPALA-8999 > {code} > Error Message > query_test/test_queries.py:77: in test_union > self.run_test_case('QueryTest/union', vector) > common/impala_test_suite.py:650: in run_test_case > self.__verify_results_and_errors(vector, test_section, result, use_db) > common/impala_test_suite.py:487: in __verify_results_and_errors > replace_filenames_with_placeholder) common/test_result_verifier.py:456: in > verify_raw_results VERIFIER_MAP[verifier](expected, actual) > common/test_result_verifier.py:278: in verify_query_result_is_equal > assert expected_results == actual_results E assert Comparing > QueryTestResults (expected vs actual): E 0,0 != 1,10 E 0,0 != 2,0 E > 0,0 != 224,80 E 1,10 != 252,90 E 1,10 != 3,10 E 1000,2000 != > 4,0 E 112,40 != 4,0 E 140,50 != 5,10 E 168,60 != 6,0 E 196,70 > != 7,10 E 2,0 != None E 2,0 != None E 224,80 != None E 252,90 > != None E 28,10 != None E 3,10 != None E 3,10 != None E 4,0 > != None E 4,0 != None E 5,10 != None E 5,10 != None E 56,20 
> != None E 6,0 != None E 6,0 != None E 7,10 != None E 7,10 != > None E 84,30 != None E Number of rows returned (expected vs actual): > 27 != 10 > Stacktrace > query_test/test_queries.py:77: in test_union > self.run_test_case('QueryTest/union', vector) > common/impala_test_suite.py:650: in run_test_case > self.__verify_results_and_errors(vector, test_section, result, use_db) > common/impala_test_suite.py:487: in __verify_results_and_errors > replace_filenames_with_placeholder) > common/test_result_verifier.py:456: in verify_raw_results > VERIFIER_MAP[verifier](expected, actual) > common/test_result_verifier.py:278: in verify_query_result_is_equal > assert expected_results == actual_results > E assert Comparing QueryTestResults (expected vs actual): > E 0,0 != 1,10 > E 0,0 != 2,0 > E 0,0 != 224,80 > E 1,10 != 252,90 > E 1,10 != 3,10 > E 1000,2000 != 4,0 > E 112,40 != 4,0 > E 140,50 != 5,10 > E 168,60 != 6,0 > E 196,70 != 7,10 > E 2,0 != None > E 2,0 != None > E 224,80 != None > E 252,90 != None > E 28,10 != None > E 3,10 != None > E 3,10 != None > E 4,0 != None > E 4,0 != None > E 5,10 != None > E 5,10 != None > E 56,20 != None > E 6,0 != None > E 6,0 != None > E 7,10 != None > E 7,10 != None > E 84,30 != None > E Number of rows returned (expected vs actual): 27 != 10 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
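Tim's theory above is mechanically simple to illustrate. Below is a hypothetical sketch (not Impyla's actual code; see cloudera/impyla#369) of a result-fetch loop that wrongly treats a 0-row batch as end-of-stream, compared with one that relies on the server's has_more flag. With a non-blocking FetchResults(), only the latter is safe.

```python
# Hypothetical reconstruction of the suspected client-side bug pattern.
def fetch_all_buggy(batches):
    rows = []
    for batch, has_more in batches:
        if not batch:   # wrong: an empty batch can just mean "no rows ready yet"
            break
        rows.extend(batch)
    return rows

def fetch_all_fixed(batches):
    rows = []
    for batch, has_more in batches:
        rows.extend(batch)
        if not has_more:  # correct: only the server's flag ends the stream
            break
    return rows

# A non-blocking server may legitimately return a 0-row batch mid-stream:
stream = [([1, 2], True), ([], True), ([3, 4], False)]
print(fetch_all_buggy(stream))  # [1, 2] -- trailing rows silently dropped
print(fetch_all_fixed(stream))  # [1, 2, 3, 4]
```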
[jira] [Updated] (IMPALA-9098) TestQueries.test_union failed
[ https://issues.apache.org/jira/browse/IMPALA-9098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong updated IMPALA-9098: -- Attachment: profile_incorrect.txt profile_correct.txt > TestQueries.test_union failed > - > > Key: IMPALA-9098 > URL: https://issues.apache.org/jira/browse/IMPALA-9098 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 3.4.0 >Reporter: Andrew Sherman >Assignee: Tim Armstrong >Priority: Critical > Labels: broken-build, flaky > Attachments: profile_correct.txt, profile_incorrect.txt > > > This happened once in an ASAN build. This *might* be a flaky test, like > IMPALA-8959 or it just might be a regression caused by IMPALA-8999 > {code} > Error Message > query_test/test_queries.py:77: in test_union > self.run_test_case('QueryTest/union', vector) > common/impala_test_suite.py:650: in run_test_case > self.__verify_results_and_errors(vector, test_section, result, use_db) > common/impala_test_suite.py:487: in __verify_results_and_errors > replace_filenames_with_placeholder) common/test_result_verifier.py:456: in > verify_raw_results VERIFIER_MAP[verifier](expected, actual) > common/test_result_verifier.py:278: in verify_query_result_is_equal > assert expected_results == actual_results E assert Comparing > QueryTestResults (expected vs actual): E 0,0 != 1,10 E 0,0 != 2,0 E > 0,0 != 224,80 E 1,10 != 252,90 E 1,10 != 3,10 E 1000,2000 != > 4,0 E 112,40 != 4,0 E 140,50 != 5,10 E 168,60 != 6,0 E 196,70 > != 7,10 E 2,0 != None E 2,0 != None E 224,80 != None E 252,90 > != None E 28,10 != None E 3,10 != None E 3,10 != None E 4,0 > != None E 4,0 != None E 5,10 != None E 5,10 != None E 56,20 > != None E 6,0 != None E 6,0 != None E 7,10 != None E 7,10 != > None E 84,30 != None E Number of rows returned (expected vs actual): > 27 != 10 > Stacktrace > query_test/test_queries.py:77: in test_union > self.run_test_case('QueryTest/union', vector) > common/impala_test_suite.py:650: in run_test_case > 
self.__verify_results_and_errors(vector, test_section, result, use_db) > common/impala_test_suite.py:487: in __verify_results_and_errors > replace_filenames_with_placeholder) > common/test_result_verifier.py:456: in verify_raw_results > VERIFIER_MAP[verifier](expected, actual) > common/test_result_verifier.py:278: in verify_query_result_is_equal > assert expected_results == actual_results > E assert Comparing QueryTestResults (expected vs actual): > E 0,0 != 1,10 > E 0,0 != 2,0 > E 0,0 != 224,80 > E 1,10 != 252,90 > E 1,10 != 3,10 > E 1000,2000 != 4,0 > E 112,40 != 4,0 > E 140,50 != 5,10 > E 168,60 != 6,0 > E 196,70 != 7,10 > E 2,0 != None > E 2,0 != None > E 224,80 != None > E 252,90 != None > E 28,10 != None > E 3,10 != None > E 3,10 != None > E 4,0 != None > E 4,0 != None > E 5,10 != None > E 5,10 != None > E 56,20 != None > E 6,0 != None > E 6,0 != None > E 7,10 != None > E 7,10 != None > E 84,30 != None > E Number of rows returned (expected vs actual): 27 != 10 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-9115) "Exec at coord is" log spam
Tim Armstrong created IMPALA-9115: - Summary: "Exec at coord is" log spam Key: IMPALA-9115 URL: https://issues.apache.org/jira/browse/IMPALA-9115 Project: IMPALA Issue Type: Bug Components: Backend Affects Versions: Impala 3.4.0 Reporter: Tim Armstrong Assignee: Bikramjeet Vig I see a lot of this in the logs, maybe we should move it to VLOG(2)? {noformat} I1026 04:26:15.066264 119815 scheduler.cc:548] 394ed7e2a194714b:31ab5b2f] Exec at coord is false I1026 04:26:15.068248 119815 scheduler.cc:548] 394ed7e2a194714b:31ab5b2f] Exec at coord is false I1026 04:26:15.069190 119815 scheduler.cc:548] 394ed7e2a194714b:31ab5b2f] Exec at coord is false I1026 04:26:15.070245 119815 scheduler.cc:548] 394ed7e2a194714b:31ab5b2f] Exec at coord is false {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
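As an illustration of the proposed demotion (the analogy is mine, not from the ticket): glog's LOG(INFO) roughly corresponds to Python logging's INFO level, and VLOG(2) to an opt-in verbose level such as DEBUG, so moving the message hides it unless verbosity is explicitly raised.

```python
import logging

logging.basicConfig()
log = logging.getLogger("scheduler")
log.setLevel(logging.INFO)  # default verbosity, like glog without --v=2

# Before: one line per fragment at the default level -> log spam.
# After: demoted so it only appears when verbose logging is enabled,
# mirroring a move from LOG(INFO) to VLOG(2) in glog.
log.debug("Exec at coord is %s", False)                 # suppressed by default
log.info("Admitting query id=%s", "394ed7e2a194714b")   # still shown
```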
[jira] [Assigned] (IMPALA-7356) Stress test for memory-based admission control
[ https://issues.apache.org/jira/browse/IMPALA-7356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong reassigned IMPALA-7356: - Assignee: (was: Tim Armstrong) > Stress test for memory-based admission control > -- > > Key: IMPALA-7356 > URL: https://issues.apache.org/jira/browse/IMPALA-7356 > Project: IMPALA > Issue Type: Test > Components: Infrastructure >Reporter: Tim Armstrong >Priority: Major > Labels: admission-control > > We should extend the existing stress test to have a new mode designed to test > memory-based admission control, where the stress test framework does not try > to throttle memory consumption but instead relies on Impala doing so. > The required changes would be: > * A mode to disable throttling > * Options for stricter pass conditions - queries should not fail with OOM > even if the stress test tries to submit way too many queries. > * However AC queue timeouts may be ok. > * Investigation into the logic for choosing which query to run next and when > - does that need to change? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-9114) TestKuduHMSIntegration failing: Kudu create table failing
[ https://issues.apache.org/jira/browse/IMPALA-9114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16964454#comment-16964454 ] Hao Hao commented on IMPALA-9114: - Looking at the Kudu log, it should be a test issue that somehow TestKuduHMSIntegration ran even when the Kudu service was not running (no Kudu log around 2019-10-31 12:46). I cannot find the log to show why this happened as the test should only start when Kudu service has been [restarted|https://github.com/apache/impala/blob/master/tests/common/custom_cluster_test_suite.py#L215]. Anyway, lower the priority as it should be a test issue. > TestKuduHMSIntegration failing: Kudu create table failing > - > > Key: IMPALA-9114 > URL: https://issues.apache.org/jira/browse/IMPALA-9114 > Project: IMPALA > Issue Type: Bug >Affects Versions: Impala 3.4.0 >Reporter: Bikramjeet Vig >Assignee: Hao Hao >Priority: Critical > Labels: broken-build > > {noformat} > Error Message > ImpalaBeeswaxException: ImpalaBeeswaxException: INNER EXCEPTION: 'beeswaxd.ttypes.BeeswaxException'> MESSAGE: AnalysisException: Cannot > analyze Kudu table 't': Error determining if Kudu's integration with the Hive > Metastore is enabled: cannot complete before timeout: > KuduRpc(method=getHiveMetastoreConfig, tablet=null, attempt=97, > TimeoutTracker(timeout=18, elapsed=178723), Trace Summary(177842 ms): > Sent(0), Received(0), Delayed(96), MasterRefresh(0), AuthRefresh(0), > Truncated: false Delayed: (UNKNOWN, [ getHiveMetastoreConfig, 96 ])) > Stacktrace > custom_cluster/test_kudu.py:150: in test_create_managed_kudu_tables > self.run_test_case('QueryTest/kudu_create', vector, > use_db=unique_database) > common/impala_test_suite.py:621: in run_test_case > result = exec_fn(query, user=test_section.get('USER', '').strip() or None) > common/impala_test_suite.py:556: in __exec_in_impala > result = self.__execute_query(target_impalad_client, query, user=user) > common/impala_test_suite.py:893: in __execute_query > return 
impalad_client.execute(query, user=user) > common/impala_connection.py:205: in execute > return self.__beeswax_client.execute(sql_stmt, user=user) > beeswax/impala_beeswax.py:187: in execute > handle = self.__execute_query(query_string.strip(), user=user) > beeswax/impala_beeswax.py:362: in __execute_query > handle = self.execute_query_async(query_string, user=user) > beeswax/impala_beeswax.py:356: in execute_query_async > handle = self.__do_rpc(lambda: self.imp_service.query(query,)) > beeswax/impala_beeswax.py:519: in __do_rpc > raise ImpalaBeeswaxException(self.__build_error_message(b), b) > E ImpalaBeeswaxException: ImpalaBeeswaxException: > EINNER EXCEPTION: > EMESSAGE: AnalysisException: Cannot analyze Kudu table 't': Error > determining if Kudu's integration with the Hive Metastore is enabled: cannot > complete before timeout: KuduRpc(method=getHiveMetastoreConfig, tablet=null, > attempt=97, TimeoutTracker(timeout=18, elapsed=178723), Trace > Summary(177842 ms): Sent(0), Received(0), Delayed(96), MasterRefresh(0), > AuthRefresh(0), Truncated: false > EDelayed: (UNKNOWN, [ getHiveMetastoreConfig, 96 ])) > Standard Output > Stopping kudu > Starting kudu (Web UI - http://localhost:8051) > Standard Error > -- 2019-10-31 12:46:18,353 INFO MainThread: Starting cluster with > command: > /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/bin/start-impala-cluster.py > '--state_store_args=--statestore_update_frequency_ms=50 > --statestore_priority_update_frequency_ms=50 > --statestore_heartbeat_frequency_ms=50' --cluster_size=3 --num_coordinators=3 > --log_dir=/data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests > --log_level=1 '--impalad_args=-kudu_client_rpc_timeout_ms=3 ' > '--state_store_args=None ' --impalad_args=--default_query_options= > 12:46:18 MainThread: Found 0 impalad/0 statestored/0 catalogd process(es) > 12:46:18 MainThread: Starting State Store logging to > 
/data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests/statestored.INFO > 12:46:18 MainThread: Starting Catalog Service logging to > /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests/catalogd.INFO > 12:46:18 MainThread: Starting Impala Daemon logging to > /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests/impalad.INFO > 12:46:18 MainThread: Starting Impala Daemon logging to > /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests/impalad_node1.INFO > 12:46:18 MainThread: Starting Impala Daemon logging to >
[jira] [Updated] (IMPALA-9114) TestKuduHMSIntegration failing: Kudu create table failing
[ https://issues.apache.org/jira/browse/IMPALA-9114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hao Hao updated IMPALA-9114: Priority: Minor (was: Critical) > TestKuduHMSIntegration failing: Kudu create table failing > - > > Key: IMPALA-9114 > URL: https://issues.apache.org/jira/browse/IMPALA-9114 > Project: IMPALA > Issue Type: Bug >Affects Versions: Impala 3.4.0 >Reporter: Bikramjeet Vig >Assignee: Hao Hao >Priority: Minor > Labels: broken-build > > {noformat} > Error Message > ImpalaBeeswaxException: ImpalaBeeswaxException: INNER EXCEPTION: 'beeswaxd.ttypes.BeeswaxException'> MESSAGE: AnalysisException: Cannot > analyze Kudu table 't': Error determining if Kudu's integration with the Hive > Metastore is enabled: cannot complete before timeout: > KuduRpc(method=getHiveMetastoreConfig, tablet=null, attempt=97, > TimeoutTracker(timeout=18, elapsed=178723), Trace Summary(177842 ms): > Sent(0), Received(0), Delayed(96), MasterRefresh(0), AuthRefresh(0), > Truncated: false Delayed: (UNKNOWN, [ getHiveMetastoreConfig, 96 ])) > Stacktrace > custom_cluster/test_kudu.py:150: in test_create_managed_kudu_tables > self.run_test_case('QueryTest/kudu_create', vector, > use_db=unique_database) > common/impala_test_suite.py:621: in run_test_case > result = exec_fn(query, user=test_section.get('USER', '').strip() or None) > common/impala_test_suite.py:556: in __exec_in_impala > result = self.__execute_query(target_impalad_client, query, user=user) > common/impala_test_suite.py:893: in __execute_query > return impalad_client.execute(query, user=user) > common/impala_connection.py:205: in execute > return self.__beeswax_client.execute(sql_stmt, user=user) > beeswax/impala_beeswax.py:187: in execute > handle = self.__execute_query(query_string.strip(), user=user) > beeswax/impala_beeswax.py:362: in __execute_query > handle = self.execute_query_async(query_string, user=user) > beeswax/impala_beeswax.py:356: in execute_query_async > handle = 
self.__do_rpc(lambda: self.imp_service.query(query,)) > beeswax/impala_beeswax.py:519: in __do_rpc > raise ImpalaBeeswaxException(self.__build_error_message(b), b) > E ImpalaBeeswaxException: ImpalaBeeswaxException: > EINNER EXCEPTION: > EMESSAGE: AnalysisException: Cannot analyze Kudu table 't': Error > determining if Kudu's integration with the Hive Metastore is enabled: cannot > complete before timeout: KuduRpc(method=getHiveMetastoreConfig, tablet=null, > attempt=97, TimeoutTracker(timeout=18, elapsed=178723), Trace > Summary(177842 ms): Sent(0), Received(0), Delayed(96), MasterRefresh(0), > AuthRefresh(0), Truncated: false > EDelayed: (UNKNOWN, [ getHiveMetastoreConfig, 96 ])) > Standard Output > Stopping kudu > Starting kudu (Web UI - http://localhost:8051) > Standard Error > -- 2019-10-31 12:46:18,353 INFO MainThread: Starting cluster with > command: > /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/bin/start-impala-cluster.py > '--state_store_args=--statestore_update_frequency_ms=50 > --statestore_priority_update_frequency_ms=50 > --statestore_heartbeat_frequency_ms=50' --cluster_size=3 --num_coordinators=3 > --log_dir=/data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests > --log_level=1 '--impalad_args=-kudu_client_rpc_timeout_ms=3 ' > '--state_store_args=None ' --impalad_args=--default_query_options= > 12:46:18 MainThread: Found 0 impalad/0 statestored/0 catalogd process(es) > 12:46:18 MainThread: Starting State Store logging to > /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests/statestored.INFO > 12:46:18 MainThread: Starting Catalog Service logging to > /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests/catalogd.INFO > 12:46:18 MainThread: Starting Impala Daemon logging to > /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests/impalad.INFO > 12:46:18 MainThread: Starting Impala 
Daemon logging to > /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests/impalad_node1.INFO > 12:46:18 MainThread: Starting Impala Daemon logging to > /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests/impalad_node2.INFO > 12:46:21 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es) > 12:46:21 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es) > 12:46:21 MainThread: Getting num_known_live_backends from > impala-ec2-centos74-m5-4xlarge-ondemand-0645.vpc.cloudera.com:25000 > 12:46:22 MainThread: Waiting for num_known_live_backends=3. Current value: 0 > 12:46:23 MainThread: Found 3 impalad/1
[jira] [Closed] (IMPALA-8768) Clarifying the conditions in which audit logs record a query
[ https://issues.apache.org/jira/browse/IMPALA-8768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexandra Rodoni closed IMPALA-8768. Fix Version/s: Impala 3.4.0 Resolution: Fixed > Clarifying the conditions in which audit logs record a query > > > Key: IMPALA-8768 > URL: https://issues.apache.org/jira/browse/IMPALA-8768 > Project: IMPALA > Issue Type: Task > Components: Docs >Affects Versions: Impala 2.13.0, Impala 3.3.0 >Reporter: Vincent Tran >Assignee: Alexandra Rodoni >Priority: Minor > Fix For: Impala 3.4.0 > > > Currently, Impala documentation highlights the following cases as operations > which the audit logs will record: > {noformat} > Which Operations Are Audited > The kinds of SQL queries represented in the audit log are: > Queries that are prevented due to lack of authorization. > Queries that Impala can analyze and parse to determine that they are > authorized. The audit data is recorded immediately after Impala finishes its > analysis, before the query is actually executed. > The audit log does not contain entries for queries that could not be parsed > and analyzed. For example, a query that fails due to a syntax error is not > recorded in the audit log. The audit log also does not contain queries that > fail due to a reference to a table that does not exist, if you would be > authorized to access the table if it did exist. > Certain statements in the impala-shell interpreter, such as CONNECT, SUMMARY, > PROFILE, SET, and QUIT, do not correspond to actual SQL queries, and these > statements are not reflected in the audit log. > {noformat} > However, based on[1], there is an unmentioned condition that the client must > have issued at least one fetch for analyzed queries to be recorded in audit > logs. > [1] > https://github.com/apache/impala/blob/b3b00da1a1c7b98e84debe11c10258c4a0dff944/be/src/service/impala-server.cc#L690-L734 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (IMPALA-8768) Clarifying the conditions in which audit logs record a query
[ https://issues.apache.org/jira/browse/IMPALA-8768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16964435#comment-16964435 ] Alexandra Rodoni commented on IMPALA-8768: -- https://gerrit.cloudera.org/#/c/14575/ > Clarifying the conditions in which audit logs record a query > > > Key: IMPALA-8768 > URL: https://issues.apache.org/jira/browse/IMPALA-8768 > Project: IMPALA > Issue Type: Task > Components: Docs >Affects Versions: Impala 2.13.0, Impala 3.3.0 >Reporter: Vincent Tran >Assignee: Alexandra Rodoni >Priority: Minor > > Currently, Impala documentation highlights the following cases as operations > which the audit logs will record: > {noformat} > Which Operations Are Audited > The kinds of SQL queries represented in the audit log are: > Queries that are prevented due to lack of authorization. > Queries that Impala can analyze and parse to determine that they are > authorized. The audit data is recorded immediately after Impala finishes its > analysis, before the query is actually executed. > The audit log does not contain entries for queries that could not be parsed > and analyzed. For example, a query that fails due to a syntax error is not > recorded in the audit log. The audit log also does not contain queries that > fail due to a reference to a table that does not exist, if you would be > authorized to access the table if it did exist. > Certain statements in the impala-shell interpreter, such as CONNECT, SUMMARY, > PROFILE, SET, and QUIT, do not correspond to actual SQL queries, and these > statements are not reflected in the audit log. > {noformat} > However, based on[1], there is an unmentioned condition that the client must > have issued at least one fetch for analyzed queries to be recorded in audit > logs. 
> [1] > https://github.com/apache/impala/blob/b3b00da1a1c7b98e84debe11c10258c4a0dff944/be/src/service/impala-server.cc#L690-L734
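The recording rules quoted in the documentation excerpt, together with the at-least-one-fetch condition noted from [1], could be distilled roughly as follows. This is a hypothetical sketch of the observed behavior, not Impala's actual audit code:

```python
def is_audited(parsed_ok, authorized, fetched_once):
    """Rough distillation of the documented audit-log rules plus the
    fetch condition noted in [1]."""
    if not parsed_ok:
        # Syntax errors and other analysis failures are never recorded.
        return False
    if not authorized:
        # Queries rejected for lack of authorization are always recorded.
        return True
    # Analyzed, authorized queries appear in the audit log only after
    # the client has issued at least one fetch.
    return fetched_once
```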
[jira] [Work started] (IMPALA-9085) Impala Doc: Refactor impala_s3.html
[ https://issues.apache.org/jira/browse/IMPALA-9085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on IMPALA-9085 started by Alexandra Rodoni. > Impala Doc: Refactor impala_s3.html > --- > > Key: IMPALA-9085 > URL: https://issues.apache.org/jira/browse/IMPALA-9085 > Project: IMPALA > Issue Type: Bug > Components: Docs >Reporter: Alexandra Rodoni >Assignee: Alexandra Rodoni >Priority: Major >
[jira] [Commented] (IMPALA-9073) Failed test during pre-commit: custom_cluster.test_executor_groups.TestExecutorGroups.test_executor_concurrency
[ https://issues.apache.org/jira/browse/IMPALA-9073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16964428#comment-16964428 ] Tim Armstrong commented on IMPALA-9073: --- I think this is because of IMPALA-8803 - releasing resources per-backend. It is possible to have > 3 concurrent queries even with only 3 slots per backend if the previous queries had released resources on those backends. I think the test should actually be checking the number of queries on each backend. {noformat} I1024 22:24:48.146070 106355 admission-controller.cc:1303] Stats: agg_num_running=3, agg_num_queued=2, agg_mem_reserved=815.26 KB, local_host(local_mem_admitted=300.05 MB, num_admitted_running=3, num_queued=2, backend_mem_reserved=75.87 KB) I1024 22:24:48.146086 106355 admission-controller.cc:1509] Could not dequeue query id=3e4bce299fd907b1:f92ff5cf reason: Not enough admission control slots available on host ip-172-31-20-105:22000. Needed 1 slots but 3/3 are already in use. I1024 22:24:48.179689 107036 coordinator.cc:508] ExecState: query id=f9492c4fae791967:35b096ec execution completed I1024 22:24:48.179709 107036 coordinator.cc:644] Coordinator waiting for backends to finish, 1 remaining. query_id=f9492c4fae791967:35b096ec I1024 22:24:48.179702 107084 krpc-data-stream-mgr.cc:298] f9492c4fae791967:35b096ec] DeregisterRecvr(): fragment_instance_id=f9492c4fae791967:35b096ec, node=1 I1024 22:24:48.179896 107084 query-state.cc:652] f9492c4fae791967:35b096ec] Instance completed. 
instance_id=f9492c4fae791967:35b096ec #in-flight=2 status=OK I1024 22:24:48.179916 107081 query-state.cc:287] f9492c4fae791967:35b096ec] UpdateBackendExecState(): last report for f9492c4fae791967:35b096ec I1024 22:24:48.179930 107088 krpc-data-stream-mgr.cc:298] e147d5691862e504:f48934e7] DeregisterRecvr(): fragment_instance_id=e147d5691862e504:f48934e7, node=1 I1024 22:24:48.179937 107038 coordinator.cc:508] ExecState: query id=e147d5691862e504:f48934e7 execution completed I1024 22:24:48.179980 107038 coordinator.cc:644] Coordinator waiting for backends to finish, 1 remaining. query_id=e147d5691862e504:f48934e7 I1024 22:24:48.180224 107088 query-state.cc:652] e147d5691862e504:f48934e7] Instance completed. instance_id=e147d5691862e504:f48934e7 #in-flight=1 status=OK I1024 22:24:48.180279 107083 query-state.cc:287] e147d5691862e504:f48934e7] UpdateBackendExecState(): last report for e147d5691862e504:f48934e7 I1024 22:24:48.180637 107035 coordinator.cc:508] ExecState: query id=1c48aa10421e20d8:677b55c4 execution completed I1024 22:24:48.180656 107089 krpc-data-stream-mgr.cc:298] 1c48aa10421e20d8:677b55c4] DeregisterRecvr(): fragment_instance_id=1c48aa10421e20d8:677b55c4, node=1 I1024 22:24:48.180670 107035 coordinator.cc:644] Coordinator waiting for backends to finish, 1 remaining. query_id=1c48aa10421e20d8:677b55c4 I1024 22:24:48.180932 107089 query-state.cc:652] 1c48aa10421e20d8:677b55c4] Instance completed. 
instance_id=1c48aa10421e20d8:677b55c4 #in-flight=0 status=OK I1024 22:24:48.180950 107082 query-state.cc:287] 1c48aa10421e20d8:677b55c4] UpdateBackendExecState(): last report for 1c48aa10421e20d8:677b55c4 I1024 22:24:48.181324 106187 coordinator.cc:768] Backend completed: host=ip-172-31-20-105:22000 remaining=1 query_id=f9492c4fae791967:35b096ec I1024 22:24:48.181390 107036 coordinator.cc:960] Release admission control resources for query_id=f9492c4fae791967:35b096ec I1024 22:24:48.181421 106355 admission-controller.cc:1291] Trying to admit id=3e4bce299fd907b1:f92ff5cf in pool_name=default-pool executor_group_name=default-pool-group1 per_host_mem_estimate=176.02 MB dedicated_coord_mem_estimate=100.02 MB max_requests=-1 (configured statically) max_queued=200 (configured statically) max_mem=-1.00 B (configured statically) I1024 22:24:48.181447 106355 admission-controller.cc:1303] Stats: agg_num_running=3, agg_num_queued=2, agg_mem_reserved=815.26 KB, local_host(local_mem_admitted=200.03 MB, num_admitted_running=3, num_queued=2, backend_mem_reserved=75.87 KB) I1024 22:24:48.181470 106355 admission-controller.cc:1443] Admitting from queue: query=3e4bce299fd907b1:f92ff5cf I1024 22:24:48.181493 107081 query-state.cc:448] f9492c4fae791967:35b096ec] Cancelling fragment instances as directed by the coordinator. Returned status: Cancelled I1024 22:24:48.181511 107081 query-state.cc:669] f9492c4fae791967:35b096ec] Cancel: query_id=f9492c4fae791967:35b096ec I1024 22:24:48.181520 107081 krpc-data-stream-mgr.cc:329] f9492c4fae791967:35b096ec] cancelling active streams for fragment_instance_id=f9492c4fae791967:35b096ec I1024 22:24:48.181553 106355 admission-controller.cc:1291]
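The per-backend check suggested in the comment above could look roughly like this. It is a hypothetical sketch: the query/backend data structures and names are assumptions for illustration, not the actual test framework's API.

```python
def backends_within_slots(running_queries, slots_per_backend):
    """Return True if no single backend runs more queries than it has
    admission-control slots. Cluster-wide concurrency may legitimately
    exceed the per-backend slot count once earlier queries have
    released their resources on some backends (IMPALA-8803)."""
    per_backend = {}
    for query in running_queries:
        for backend in query["backends"]:
            per_backend[backend] = per_backend.get(backend, 0) + 1
    return all(n <= slots_per_backend for n in per_backend.values())
```

Under this invariant, four concurrent queries spread across different backends can pass even with 3 slots per backend, while four queries all landing on one 3-slot backend would fail.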
[jira] [Commented] (IMPALA-9098) TestQueries.test_union failed
[ https://issues.apache.org/jira/browse/IMPALA-9098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16964420#comment-16964420 ] Tim Armstrong commented on IMPALA-9098: --- No luck reproducing this. > TestQueries.test_union failed > - > > Key: IMPALA-9098 > URL: https://issues.apache.org/jira/browse/IMPALA-9098 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 3.4.0 >Reporter: Andrew Sherman >Assignee: Tim Armstrong >Priority: Critical > Labels: broken-build, flaky > > This happened once in an ASAN build. This *might* be a flaky test, like > IMPALA-8959 or it just might be a regression caused by IMPALA-8999 > {code} > Error Message > query_test/test_queries.py:77: in test_union > self.run_test_case('QueryTest/union', vector) > common/impala_test_suite.py:650: in run_test_case > self.__verify_results_and_errors(vector, test_section, result, use_db) > common/impala_test_suite.py:487: in __verify_results_and_errors > replace_filenames_with_placeholder) common/test_result_verifier.py:456: in > verify_raw_results VERIFIER_MAP[verifier](expected, actual) > common/test_result_verifier.py:278: in verify_query_result_is_equal > assert expected_results == actual_results E assert Comparing > QueryTestResults (expected vs actual): E 0,0 != 1,10 E 0,0 != 2,0 E > 0,0 != 224,80 E 1,10 != 252,90 E 1,10 != 3,10 E 1000,2000 != > 4,0 E 112,40 != 4,0 E 140,50 != 5,10 E 168,60 != 6,0 E 196,70 > != 7,10 E 2,0 != None E 2,0 != None E 224,80 != None E 252,90 > != None E 28,10 != None E 3,10 != None E 3,10 != None E 4,0 > != None E 4,0 != None E 5,10 != None E 5,10 != None E 56,20 > != None E 6,0 != None E 6,0 != None E 7,10 != None E 7,10 != > None E 84,30 != None E Number of rows returned (expected vs actual): > 27 != 10 > Stacktrace > query_test/test_queries.py:77: in test_union > self.run_test_case('QueryTest/union', vector) > common/impala_test_suite.py:650: in run_test_case > self.__verify_results_and_errors(vector, test_section, 
result, use_db) > common/impala_test_suite.py:487: in __verify_results_and_errors > replace_filenames_with_placeholder) > common/test_result_verifier.py:456: in verify_raw_results > VERIFIER_MAP[verifier](expected, actual) > common/test_result_verifier.py:278: in verify_query_result_is_equal > assert expected_results == actual_results > E assert Comparing QueryTestResults (expected vs actual): > E 0,0 != 1,10 > E 0,0 != 2,0 > E 0,0 != 224,80 > E 1,10 != 252,90 > E 1,10 != 3,10 > E 1000,2000 != 4,0 > E 112,40 != 4,0 > E 140,50 != 5,10 > E 168,60 != 6,0 > E 196,70 != 7,10 > E 2,0 != None > E 2,0 != None > E 224,80 != None > E 252,90 != None > E 28,10 != None > E 3,10 != None > E 3,10 != None > E 4,0 != None > E 4,0 != None > E 5,10 != None > E 5,10 != None > E 56,20 != None > E 6,0 != None > E 6,0 != None > E 7,10 != None > E 7,10 != None > E 84,30 != None > E Number of rows returned (expected vs actual): 27 != 10 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-8557) Impala on ABFS failed with error "IllegalArgumentException: ABFS does not allow files or directories to end with a dot."
[ https://issues.apache.org/jira/browse/IMPALA-8557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16964418#comment-16964418 ] Sahil Takiar commented on IMPALA-8557: -- I can think of two solutions: (1) append a "txt" extension to all written text files, or (2) if a file_extension is not specified, remove the final dot at the end of the file name. If we implement option #1, we should implement #2 as well to make the code more defensive and prevent this from recurring if we add support for writing additional file types. > Impala on ABFS failed with error "IllegalArgumentException: ABFS does not > allow files or directories to end with a dot." > > > Key: IMPALA-8557 > URL: https://issues.apache.org/jira/browse/IMPALA-8557 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 3.2.0 >Reporter: Eric Lin >Assignee: Sahil Takiar >Priority: Major > > HDFS introduced the feature below to stop users from creating a file that ends > with "." on ABFS: > https://issues.apache.org/jira/browse/HADOOP-15860 > As a result of this change, Impala writes to ABFS now fail with this error. > I can see that it generates the temp file using the format "$0.$1.$2": > https://github.com/cloudera/Impala/blob/cdh6.2.0/be/src/exec/hdfs-table-sink.cc#L329 > $2 is the file extension and will be empty for the TEXT file format: > https://github.com/cloudera/Impala/blob/cdh6.2.0/be/src/exec/hdfs-text-table-writer.cc#L65 > Since HADOOP-15860 was backported into CDH6.2, this currently only affects > 6.2 and works in older versions. > There is no way to override this empty file extension, so no workaround is > possible unless the user chooses another file format.
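Option #2 amounts to a one-line guard when the temp file name is built from the "$0.$1.$2" template. Below is a hypothetical Python rendering of the idea (the real code is C++ in hdfs-table-sink.cc; the function and parameter names here are illustrative, not Impala's):

```python
def tmp_file_name(base, unique_id, file_extension):
    """Build a "$0.$1.$2"-style temp file name, dropping the trailing
    dot when no extension is given so ABFS (HADOOP-15860) accepts it."""
    name = "%s.%s.%s" % (base, unique_id, file_extension)
    if not file_extension and name.endswith("."):
        # Option #2: an empty extension would otherwise leave "base.id."
        name = name[:-1]
    return name
```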
[jira] [Assigned] (IMPALA-8557) Impala on ABFS failed with error "IllegalArgumentException: ABFS does not allow files or directories to end with a dot."
[ https://issues.apache.org/jira/browse/IMPALA-8557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sahil Takiar reassigned IMPALA-8557: Assignee: Sahil Takiar > Impala on ABFS failed with error "IllegalArgumentException: ABFS does not > allow files or directories to end with a dot." > > > Key: IMPALA-8557 > URL: https://issues.apache.org/jira/browse/IMPALA-8557 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 3.2.0 >Reporter: Eric Lin >Assignee: Sahil Takiar >Priority: Major > > HDFS introduced below feature to stop users from creating a file that ends > with "." on ABFS: > https://issues.apache.org/jira/browse/HADOOP-15860 > As a result of this change, Impala now writes to ABFS fails with such error. > I can see that it generates temp file using this format "$0.$1.$2": > https://github.com/cloudera/Impala/blob/cdh6.2.0/be/src/exec/hdfs-table-sink.cc#L329 > $2 is the file extension and will be empty if it is TEXT file format: > https://github.com/cloudera/Impala/blob/cdh6.2.0/be/src/exec/hdfs-text-table-writer.cc#L65 > Since HADOOP-15860 was backported into CDH6.2, it is currently only affecting > 6.2 and works in older versions. > There is no way to override this empty file extension so no workaround is > possible, unless user choose another file format. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-9048) Impala Doc: Document the global INVALIDATE METADATA on fetch-on-demand impalad
[ https://issues.apache.org/jira/browse/IMPALA-9048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexandra Rodoni updated IMPALA-9048: - Description: The feature code review: https://gerrit.cloudera.org/#/c/14307/ > Impala Doc: Document the global INVALIDATE METADATA on fetch-on-demand impalad > -- > > Key: IMPALA-9048 > URL: https://issues.apache.org/jira/browse/IMPALA-9048 > Project: IMPALA > Issue Type: Task > Components: Docs >Reporter: Alexandra Rodoni >Assignee: Alexandra Rodoni >Priority: Major > Labels: future_release_doc, in_34 > > The feature code review: https://gerrit.cloudera.org/#/c/14307/
[jira] [Commented] (IMPALA-8755) Implement Z-ordering for Impala
[ https://issues.apache.org/jira/browse/IMPALA-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16964414#comment-16964414 ] Alexandra Rodoni commented on IMPALA-8755: -- [~norbertluksa] [~boroknagyz] Should this be documented now for 3.4? Or should we wait until the custom sorting with Z-Order is implemented before documenting this? > Implement Z-ordering for Impala > --- > > Key: IMPALA-8755 > URL: https://issues.apache.org/jira/browse/IMPALA-8755 > Project: IMPALA > Issue Type: New Feature >Reporter: Zoltán Borók-Nagy >Assignee: Norbert Luksa >Priority: Major > > Implement Z-ordering for Impala: [https://en.wikipedia.org/wiki/Z-order_curve] > A Z-order curve defines an ordering on multi-dimensional data. Data sorted > that way can be efficiently filtered by min/max statistics for the > columns participating in the ordering. > Impala currently only supports lexicographic ordering via the SORT BY clause. > This strongly prefers the first column, i.e. given the "SORT BY A, B, C" > clause => A will be totally ordered (hence filtering on A will be very > efficient), but values belonging to B and C will be scattered throughout the > data set (hence filtering on B or C will barely do any good). > We could add a new clause, e.g. a "ZSORT BY" clause to Impala that writes the > data in Z-order. > "ZSORT BY A, B, C" would cluster the rows in a way that filtering on A, B, or > C would be equally efficient.
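The property the issue describes, filtering on any participating column being equally efficient, comes from bit-interleaving. A minimal two-column sketch of a Z-order (Morton) key is below; this is illustrative only, and Impala's eventual implementation may differ:

```python
def z_order_key(a, b, bits=16):
    """Interleave the bits of two non-negative ints (a Morton code);
    sorting rows by this key clusters them along the Z-order curve."""
    key = 0
    for i in range(bits):
        key |= ((a >> i) & 1) << (2 * i + 1)
        key |= ((b >> i) & 1) << (2 * i)
    return key

# Rows close in either column end up close in the sorted order, so
# min/max pruning can help for both A and B, not just the first
# lexicographic sort key.
rows = [(3, 5), (0, 7), (6, 1), (2, 2)]
rows.sort(key=lambda r: z_order_key(*r))
```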
[jira] [Created] (IMPALA-9114) TestKuduHMSIntegration failing: Kudu create table failing
Bikramjeet Vig created IMPALA-9114: -- Summary: TestKuduHMSIntegration failing: Kudu create table failing Key: IMPALA-9114 URL: https://issues.apache.org/jira/browse/IMPALA-9114 Project: IMPALA Issue Type: Bug Affects Versions: Impala 3.4.0 Reporter: Bikramjeet Vig Assignee: Hao Hao {noformat} Error Message ImpalaBeeswaxException: ImpalaBeeswaxException: INNER EXCEPTION: MESSAGE: AnalysisException: Cannot analyze Kudu table 't': Error determining if Kudu's integration with the Hive Metastore is enabled: cannot complete before timeout: KuduRpc(method=getHiveMetastoreConfig, tablet=null, attempt=97, TimeoutTracker(timeout=18, elapsed=178723), Trace Summary(177842 ms): Sent(0), Received(0), Delayed(96), MasterRefresh(0), AuthRefresh(0), Truncated: false Delayed: (UNKNOWN, [ getHiveMetastoreConfig, 96 ])) Stacktrace custom_cluster/test_kudu.py:150: in test_create_managed_kudu_tables self.run_test_case('QueryTest/kudu_create', vector, use_db=unique_database) common/impala_test_suite.py:621: in run_test_case result = exec_fn(query, user=test_section.get('USER', '').strip() or None) common/impala_test_suite.py:556: in __exec_in_impala result = self.__execute_query(target_impalad_client, query, user=user) common/impala_test_suite.py:893: in __execute_query return impalad_client.execute(query, user=user) common/impala_connection.py:205: in execute return self.__beeswax_client.execute(sql_stmt, user=user) beeswax/impala_beeswax.py:187: in execute handle = self.__execute_query(query_string.strip(), user=user) beeswax/impala_beeswax.py:362: in __execute_query handle = self.execute_query_async(query_string, user=user) beeswax/impala_beeswax.py:356: in execute_query_async handle = self.__do_rpc(lambda: self.imp_service.query(query,)) beeswax/impala_beeswax.py:519: in __do_rpc raise ImpalaBeeswaxException(self.__build_error_message(b), b) E ImpalaBeeswaxException: ImpalaBeeswaxException: EINNER EXCEPTION: EMESSAGE: AnalysisException: Cannot analyze Kudu table 't': Error 
determining if Kudu's integration with the Hive Metastore is enabled: cannot complete before timeout: KuduRpc(method=getHiveMetastoreConfig, tablet=null, attempt=97, TimeoutTracker(timeout=18, elapsed=178723), Trace Summary(177842 ms): Sent(0), Received(0), Delayed(96), MasterRefresh(0), AuthRefresh(0), Truncated: false EDelayed: (UNKNOWN, [ getHiveMetastoreConfig, 96 ])) Standard Output Stopping kudu Starting kudu (Web UI - http://localhost:8051) Standard Error -- 2019-10-31 12:46:18,353 INFO MainThread: Starting cluster with command: /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/bin/start-impala-cluster.py '--state_store_args=--statestore_update_frequency_ms=50 --statestore_priority_update_frequency_ms=50 --statestore_heartbeat_frequency_ms=50' --cluster_size=3 --num_coordinators=3 --log_dir=/data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests --log_level=1 '--impalad_args=-kudu_client_rpc_timeout_ms=3 ' '--state_store_args=None ' --impalad_args=--default_query_options= 12:46:18 MainThread: Found 0 impalad/0 statestored/0 catalogd process(es) 12:46:18 MainThread: Starting State Store logging to /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests/statestored.INFO 12:46:18 MainThread: Starting Catalog Service logging to /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests/catalogd.INFO 12:46:18 MainThread: Starting Impala Daemon logging to /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests/impalad.INFO 12:46:18 MainThread: Starting Impala Daemon logging to /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests/impalad_node1.INFO 12:46:18 MainThread: Starting Impala Daemon logging to /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests/impalad_node2.INFO 12:46:21 MainThread: Found 3 impalad/1 statestored/1 catalogd 
process(es) 12:46:21 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es) 12:46:21 MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-0645.vpc.cloudera.com:25000 12:46:22 MainThread: Waiting for num_known_live_backends=3. Current value: 0 12:46:23 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es) 12:46:23 MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-0645.vpc.cloudera.com:25000 12:46:23 MainThread: Waiting for num_known_live_backends=3. Current value: 0 12:46:24 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es) 12:46:24 MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-0645.vpc.cloudera.com:25000
[jira] [Created] (IMPALA-9113) Queries can hang if an impalad is killed after a query has FINISHED
Sahil Takiar created IMPALA-9113: Summary: Queries can hang if an impalad is killed after a query has FINISHED Key: IMPALA-9113 URL: https://issues.apache.org/jira/browse/IMPALA-9113 Project: IMPALA Issue Type: Bug Components: Backend, Clients Reporter: Sahil Takiar Assignee: Sahil Takiar There is a race condition in the query coordination code that could cause queries to hang indefinitely in an un-cancellable state if an impalad crashes after the query has transitioned to the FINISHED state, but before all backends have completed. The issue occurs if: * A query produces all results * A client issues a fetch request to read all of those results * The client fetch request fetches all available rows (e.g. eos is hit) * {{Coordinator::GetNext}} then calls {{SetNonErrorTerminalState(ExecState::RETURNED_RESULTS)}} which eventually calls {{WaitForBackends()}} * {{WaitForBackends()}} will block until all backends have completed * One of the impalads running the query crashes, and thus never reports success for the query fragment it was running * The {{WaitForBackends()}} call will then block indefinitely * Any attempt to cancel the query fails because the original fetch request that drove the {{WaitForBackends()}} call has acquired the {{ClientRequestState}} lock, which thus prevents any cancellation from occurring. Implementing IMPALA-6984 should theoretically fix because as soon as eos is hit, it would call {{CancelBackends()}} rather than {{WaitForBackends()}}. Another solution would be to add a timeout to the {{WaitForBackends()}} so that it returns after the timeout is hit, this would force the fetch request to return 0 rows with {{hasMoreRows=true}}, and unblock any cancellation threads. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
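The second fix proposed above (a timeout on {{WaitForBackends()}}) can be sketched in miniature. This is a hypothetical, simplified Python model, not Impala's actual C++ coordinator code; class and method names are illustrative:

```python
import threading

# Toy model of the proposed fix: a backend-completion wait that takes a
# timeout and is also woken by cancellation, instead of blocking forever
# the way the WaitForBackends() described above can.
class BackendState:
    def __init__(self, num_backends):
        self.cv = threading.Condition()
        self.remaining = num_backends
        self.cancelled = False

    def backend_done(self):
        # A backend reported successful completion of its fragment.
        with self.cv:
            self.remaining -= 1
            if self.remaining == 0:
                self.cv.notify_all()

    def cancel(self):
        # Cancellation wakes any waiter instead of being blocked by it.
        with self.cv:
            self.cancelled = True
            self.cv.notify_all()

    def wait_for_backends(self, timeout_s):
        """Return True if all backends finished; False on timeout or cancel."""
        with self.cv:
            done = self.cv.wait_for(
                lambda: self.remaining == 0 or self.cancelled,
                timeout=timeout_s)
            return done and self.remaining == 0

state = BackendState(num_backends=3)
state.backend_done()
state.backend_done()
# The third backend crashed and never reports success: with a timeout the
# wait returns (so the fetch can return 0 rows with hasMoreRows=true)
# instead of hanging while holding the ClientRequestState lock.
print(state.wait_for_backends(timeout_s=0.1))  # False
```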
[jira] [Work started] (IMPALA-4400) Aggregate runtime filters locally
[ https://issues.apache.org/jira/browse/IMPALA-4400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on IMPALA-4400 started by Tim Armstrong. - > Aggregate runtime filters locally > - > > Key: IMPALA-4400 > URL: https://issues.apache.org/jira/browse/IMPALA-4400 > Project: IMPALA > Issue Type: Sub-task > Components: Backend >Affects Versions: Impala 2.8.0 >Reporter: Marcel Kinard >Assignee: Tim Armstrong >Priority: Major > > At the moment, runtime filters are sent from each fragment instance directly > to the coordinator for aggregation (ORing) at the coordinator. > With multi-threaded execution, we will have an order of magnitude more > fragment instances per node, at which point the coordinator would become a > bottleneck during the aggregation process. To avoid that, we need to > aggregate the local instances' runtime filters at each node before sending > the filter off to the coordinator. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Resolved] (IMPALA-9065) Fix cancellation of RuntimeFilter::WaitForArrival()
[ https://issues.apache.org/jira/browse/IMPALA-9065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong resolved IMPALA-9065. --- Fix Version/s: Impala 3.4.0 Resolution: Fixed > Fix cancellation of RuntimeFilter::WaitForArrival() > --- > > Key: IMPALA-9065 > URL: https://issues.apache.org/jira/browse/IMPALA-9065 > Project: IMPALA > Issue Type: Sub-task > Components: Distributed Exec >Affects Versions: Impala 3.3.0 >Reporter: Tim Armstrong >Assignee: Tim Armstrong >Priority: Major > Fix For: Impala 3.4.0 > > > Proper cancellation wasn't ever implemented for this code path, so if the > wait time is set high, threads can get blocked indefinitely even if the > coordinator cancelled the query. > I don't think it's hard to do the right thing - signal the filter and wake > up the thread when the finstance is cancelled. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
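The fix described above ("signal the filter and wake up the thread when the finstance is cancelled") is a classic condition-variable pattern. A hedged Python sketch of that shape (the real code is Impala C++; names here are illustrative):

```python
import threading

# Illustrative model: WaitForArrival() blocks on a condition variable that
# cancellation also signals, so a cancelled fragment instance wakes up
# promptly even when the configured wait time is very high.
class RuntimeFilter:
    def __init__(self):
        self.cv = threading.Condition()
        self.arrived = False
        self.cancelled = False

    def set_filter(self):
        # The aggregated filter arrives from the coordinator.
        with self.cv:
            self.arrived = True
            self.cv.notify_all()

    def cancel(self):
        # Finstance cancelled: wake any thread blocked in wait_for_arrival.
        with self.cv:
            self.cancelled = True
            self.cv.notify_all()

    def wait_for_arrival(self, wait_time_s):
        with self.cv:
            self.cv.wait_for(lambda: self.arrived or self.cancelled,
                             timeout=wait_time_s)
            return self.arrived

f = RuntimeFilter()
# Even with a huge wait time, cancel() unblocks the waiter almost immediately.
threading.Timer(0.05, f.cancel).start()
print(f.wait_for_arrival(wait_time_s=3600))  # False: woken by cancellation
```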
[jira] [Resolved] (IMPALA-9108) Unused leveldbjni dependency triggers some security scanners
[ https://issues.apache.org/jira/browse/IMPALA-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong resolved IMPALA-9108. --- Fix Version/s: Impala 3.4.0 Resolution: Fixed > Unused leveldbjni dependency triggers some security scanners > > > Key: IMPALA-9108 > URL: https://issues.apache.org/jira/browse/IMPALA-9108 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Reporter: Tim Armstrong >Assignee: Tim Armstrong >Priority: Critical > Fix For: Impala 3.4.0 > > > A windows dll in leveldbjni-all-1.8.jar is flagged by some security scanners. > We shouldn't have a dependency on leveldb, so we should exclude this and not > pull in the jar. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
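Excluding a transitive dependency like this is a standard Maven pattern. A sketch only, assuming hive-serde as the transitive parent (the actual parent artifacts and property names in Impala's pom are not taken from the source):

```xml
<!-- Sketch: exclude leveldbjni from an assumed transitive parent (hive-serde).
     groupId/artifactId of the parent and ${hive.version} are illustrative. -->
<dependency>
  <groupId>org.apache.hive</groupId>
  <artifactId>hive-serde</artifactId>
  <version>${hive.version}</version>
  <exclusions>
    <exclusion>
      <groupId>org.fusesource.leveldbjni</groupId>
      <artifactId>leveldbjni-all</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

Adding the same coordinates to the maven-enforcer-plugin's bannedDependencies rule then makes the build fail if any other dependency pulls the jar back in.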
[jira] [Commented] (IMPALA-9108) Unused leveldbjni dependency triggers some security scanners
[ https://issues.apache.org/jira/browse/IMPALA-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16964245#comment-16964245 ] ASF subversion and git services commented on IMPALA-9108: - Commit 28b1d53f9cb7581974dfc0b2dd75f2f015c1c6b9 in impala's branch refs/heads/master from Tim Armstrong [ https://gitbox.apache.org/repos/asf?p=impala.git;h=28b1d53 ] IMPALA-9108: exclude leveldbjni mvn dependency We don't need this at all - it's pulled in via some transitive dependencies, e.g. htrace and hive-serde. Add an exclusion and add it as a banned dependency. Change-Id: I90b63bc03511545530e1506bc602623591c56e98 Reviewed-on: http://gerrit.cloudera.org:8080/14593 Tested-by: Impala Public Jenkins Reviewed-by: Joe McDonnell > Unused leveldbjni dependency triggers some security scanners > > > Key: IMPALA-9108 > URL: https://issues.apache.org/jira/browse/IMPALA-9108 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Reporter: Tim Armstrong >Assignee: Tim Armstrong >Priority: Critical > Fix For: Impala 3.4.0 > > > A windows dll in leveldbjni-all-1.8.jar is flagged by some security scanners. > We shouldn't have a dependency on leveldb, so we should exclude this and not > pull in the jar. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Comment Edited] (IMPALA-3933) Time zone definitions of Hive/Spark and Impala differ for historical dates
[ https://issues.apache.org/jira/browse/IMPALA-3933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16964206#comment-16964206 ] Csaba Ringhofer edited comment on IMPALA-3933 at 10/31/19 4:54 PM: --- [~mylogi...@gmail.com] I checked dates with close to ~asf-master Impala and CDH/CDP Hives. We work differently depending on the Hive version.
{code}
From Hive:
create table tdate (d date) stored as parquet;
insert into table tdate values ("0001-01-01"), ("1400-01-01"), ("1500-01-01"), ("1800-01-01");

From Impala:
invalidate metadata tdate;
select * from tdate;

When the data was inserted with CDP Hive, we return all values correctly:
+------------+
| d          |
+------------+
| 0001-01-01 |
| 1400-01-01 |
| 1500-01-01 |
| 1800-01-01 |
+------------+

With CDH Hive, the very old dates are shifted, probably related to Julian vs Proleptic Gregorian interpretation of old dates:
+------------+
| d          |
+------------+
| NULL       |
| 1400-01-09 |
| 1500-01-10 |
| 1800-01-01 |
+------------+
WARNINGS: Parquet file 'hdfs://localhost:20500/test-warehouse/tdate/00_0' column 'd' contains an out of range date. The valid date range is 0001-01-01..-12-31.
{code}
So dates are also problematic with CDH Hive, but it is a different problem than the one described in the description of the Jira. The original issue is about historical timezone rules, which do not affect dates, but very old dates are still affected by different Julian/Gregorian handling. I think Hive switched to Proleptic Gregorian in Hive 3.1, so it is similar to Impala from that point.

was (Author: csringhofer): [~mylogi...@gmail.com] I checked dates with close to ~asf-master Impala and CDH/CDP Hives. We work differently depending on the Hive version.
{code} >From Hive: create table tdate (d date) stored as parquet; insert into table tdate values ("0001-01-01"), ("1400-01-01"), ("1500-01-01"), ("1800-01-01"); >From Impala: invalidate metadata tdate; select * from tdata; When the data was inserted with CDP Hive, we return all values correctly: ++ | d | ++ | 0001-01-01 | | 1400-01-01 | | 1500-01-01 | | 1800-01-01 | ++ With CDH Hive, the very old dates are shifted, probably related to Julian vs Proleptic Gregorian interpretation of old dates: ++ | d | ++ | NULL | | 1400-01-09 | | 1500-01-10 | | 1800-01-01 | ++ WARNINGS: Parquet file 'hdfs://localhost:20500/test-warehouse/tdate/00_0' column 'd' contains an out of range date. The valid date range is 0001-01-01..-12-31. {code} So dates are also problematic with CDH Hive, but it is a different problem than the one described in the description of the Jira. The original issue is about historical timezone rules, which do not affect dates, but very old dates are still affected by different Julian/Gregorian handling. I think Hive switched to Proleptic Gregorian in Hive 3.1. so it is similar to Impal from that point. > Time zone definitions of Hive/Spark and Impala differ for historical dates > -- > > Key: IMPALA-3933 > URL: https://issues.apache.org/jira/browse/IMPALA-3933 > Project: IMPALA > Issue Type: New Feature > Components: Backend >Affects Versions: impala 2.3 >Reporter: Adriano Simone >Priority: Minor > > How the TIMESTAMP skew with convert_legacy_hive_parquet_utc_timestamps=true > Enabling --convert_legacy_hive_parquet_utc_timestamps=true seems to cause > data skew (improper converting) upon the reading for dates earlier than 1900 > (not sure about the exact date). > The following example was run on a server which is in CEST timezone, thus the > time difference is GMT+1 for dates before 1900 (I'm not sure, I haven't > checked the exact starting date of DST computation), and GMT+2 when summer > daylight saving time was applied. 
> create table itst (col1 int, myts timestamp) stored as parquet; > From impala: > {code:java} > insert into itst values (1,'2016-04-15 12:34:45'); > insert into itst values (2,'1949-04-15 12:34:45'); > insert into itst values (3,'1753-04-15 12:34:45'); > insert into itst values (4,'1752-04-15 12:34:45'); > {code} > from hive > {code:java} > insert into itst values (5,'2016-04-15 12:34:45'); > insert into itst values (6,'1949-04-15 12:34:45'); > insert into itst values (7,'1753-04-15 12:34:45'); > insert into itst values (8,'1752-04-15 12:34:45'); > {code} > From impala > {code:java} > select * from itst order by col1; > {code} > Result: > {code:java} > Query: select * from itst > +--+-+ > | col1 | myts| > +--+-+ > | 1| 2016-04-15 12:34:45 | > | 2| 1949-04-15 12:34:45 | > | 3| 1753-04-15 12:34:45 | > | 4
[jira] [Commented] (IMPALA-3933) Time zone definitions of Hive/Spark and Impala differ for historical dates
[ https://issues.apache.org/jira/browse/IMPALA-3933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16964206#comment-16964206 ] Csaba Ringhofer commented on IMPALA-3933: - [~mylogi...@gmail.com] I checked dates with close to ~asf-master Impala and CDH/CDP Hives. We work differently depending on the Hive version.
{code}
From Hive:
create table tdate (d date) stored as parquet;
insert into table tdate values ("0001-01-01"), ("1400-01-01"), ("1500-01-01"), ("1800-01-01");

From Impala:
invalidate metadata tdate;
select * from tdate;

When the data was inserted with CDP Hive, we return all values correctly:
+------------+
| d          |
+------------+
| 0001-01-01 |
| 1400-01-01 |
| 1500-01-01 |
| 1800-01-01 |
+------------+

With CDH Hive, the very old dates are shifted, probably related to Julian vs Proleptic Gregorian interpretation of old dates:
+------------+
| d          |
+------------+
| NULL       |
| 1400-01-09 |
| 1500-01-10 |
| 1800-01-01 |
+------------+
WARNINGS: Parquet file 'hdfs://localhost:20500/test-warehouse/tdate/00_0' column 'd' contains an out of range date. The valid date range is 0001-01-01..-12-31.
{code}
So dates are also problematic with CDH Hive, but it is a different problem than the one described in the description of the Jira. The original issue is about historical timezone rules, which do not affect dates, but very old dates are still affected by different Julian/Gregorian handling. I think Hive switched to Proleptic Gregorian in Hive 3.1, so it is similar to Impala from that point.
> Time zone definitions of Hive/Spark and Impala differ for historical dates
> --------------------------------------------------------------------------
>
>                 Key: IMPALA-3933
>                 URL: https://issues.apache.org/jira/browse/IMPALA-3933
>             Project: IMPALA
>          Issue Type: New Feature
>          Components: Backend
>    Affects Versions: impala 2.3
>            Reporter: Adriano Simone
>            Priority: Minor
>
> How the TIMESTAMP skew with convert_legacy_hive_parquet_utc_timestamps=true
> Enabling --convert_legacy_hive_parquet_utc_timestamps=true seems to cause data skew (improper converting) upon the reading for dates earlier than 1900 (not sure about the exact date).
> The following example was run on a server which is in CEST timezone, thus the time difference is GMT+1 for dates before 1900 (I'm not sure, I haven't checked the exact starting date of DST computation), and GMT+2 when summer daylight saving time was applied.
> create table itst (col1 int, myts timestamp) stored as parquet;
> From impala:
> {code:java}
> insert into itst values (1,'2016-04-15 12:34:45');
> insert into itst values (2,'1949-04-15 12:34:45');
> insert into itst values (3,'1753-04-15 12:34:45');
> insert into itst values (4,'1752-04-15 12:34:45');
> {code}
> from hive
> {code:java}
> insert into itst values (5,'2016-04-15 12:34:45');
> insert into itst values (6,'1949-04-15 12:34:45');
> insert into itst values (7,'1753-04-15 12:34:45');
> insert into itst values (8,'1752-04-15 12:34:45');
> {code}
> From impala
> {code:java}
> select * from itst order by col1;
> {code}
> Result:
> {code:java}
> Query: select * from itst
> +------+---------------------+
> | col1 | myts                |
> +------+---------------------+
> | 1    | 2016-04-15 12:34:45 |
> | 2    | 1949-04-15 12:34:45 |
> | 3    | 1753-04-15 12:34:45 |
> | 4    | 1752-04-15 12:34:45 |
> | 5    | 2016-04-15 10:34:45 |
> | 6    | 1949-04-15 10:34:45 |
> | 7    | 1753-04-15 11:34:45 |
> | 8    | 1752-04-15 11:34:45 |
> +------+---------------------+
> {code}
> The timestamps are looking good, the DST differences can be seen (hive inserted it in local time, but impala shows it in UTC)
> From impala after setting the command line argument "--convert_legacy_hive_parquet_utc_timestamps=true"
> {code:java}
> select * from itst order by col1;
> {code}
> The result in this case:
> {code:java}
> Query: select * from itst order by col1
> +------+---------------------+
> | col1 | myts                |
> +------+---------------------+
> | 1    | 2016-04-15 12:34:45 |
> | 2    | 1949-04-15 12:34:45 |
> | 3    | 1753-04-15 12:34:45 |
> | 4    | 1752-04-15 12:34:45 |
> | 5    | 2016-04-15 12:34:45 |
> | 6    | 1949-04-15 12:34:45 |
> | 7    | 1753-04-15 12:51:05 |
> | 8    | 1752-04-15 12:51:05 |
> +------+---------------------+
> {code}
> It seems that instead of 11:34:45 it is showing 12:51:05.
-- This message was sent by Atlassian Jira (v8.3.4#803005)
- To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org
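The 8- and 9-day shifts reported in the comment above (1400-01-01 → 1400-01-09, 1500-01-01 → 1500-01-10) are exactly what falls out of writing a day value under the Julian calendar and reading it back as proleptic Gregorian. A small illustrative sketch (not Impala or Hive code) of that reinterpretation:

```python
from datetime import date

# Map a Julian-calendar date to the proleptic Gregorian date falling on the
# same real day, via the Julian Day Number. Python's `date` is proleptic
# Gregorian. Illustrative only: it shows why a hybrid-calendar writer and a
# proleptic-Gregorian reader disagree on dates before the 1582 cutover.
def julian_to_gregorian(y, m, d):
    a = (14 - m) // 12
    yy = y + 4800 - a
    mm = m + 12 * a - 3
    # JDN of a Julian-calendar date (standard arithmetic formula).
    jdn = d + (153 * mm + 2) // 5 + 365 * yy + yy // 4 - 32083
    # JDN 1721426 is proleptic Gregorian 0001-01-01, i.e. ordinal 1.
    return date.fromordinal(jdn - 1721425)

print(julian_to_gregorian(1400, 1, 1))  # 1400-01-09: the 8-day shift above
print(julian_to_gregorian(1500, 1, 1))  # 1500-01-10: the 9-day shift above
# 1800-01-01 is unchanged in the comment's output because the hybrid calendar
# already uses Gregorian rules after the 1582 cutover.
```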
[jira] [Commented] (IMPALA-3933) Time zone definitions of Hive/Spark and Impala differ for historical dates
[ https://issues.apache.org/jira/browse/IMPALA-3933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16964196#comment-16964196 ] Tim Armstrong commented on IMPALA-3933: --- Dates don't have timezones or any kind of conversions, so yes - everything is a lot simpler. > Time zone definitions of Hive/Spark and Impala differ for historical dates > -- > > Key: IMPALA-3933 > URL: https://issues.apache.org/jira/browse/IMPALA-3933 > Project: IMPALA > Issue Type: New Feature > Components: Backend >Affects Versions: impala 2.3 >Reporter: Adriano Simone >Priority: Minor > > How the TIMESTAMP skew with convert_legacy_hive_parquet_utc_timestamps=true > Enabling --convert_legacy_hive_parquet_utc_timestamps=true seems to cause > data skew (improper converting) upon the reading for dates earlier than 1900 > (not sure about the exact date). > The following example was run on a server which is in CEST timezone, thus the > time difference is GMT+1 for dates before 1900 (I'm not sure, I haven't > checked the exact starting date of DST computation), and GMT+2 when summer > daylight saving time was applied. 
> create table itst (col1 int, myts timestamp) stored as parquet; > From impala: > {code:java} > insert into itst values (1,'2016-04-15 12:34:45'); > insert into itst values (2,'1949-04-15 12:34:45'); > insert into itst values (3,'1753-04-15 12:34:45'); > insert into itst values (4,'1752-04-15 12:34:45'); > {code} > from hive > {code:java} > insert into itst values (5,'2016-04-15 12:34:45'); > insert into itst values (6,'1949-04-15 12:34:45'); > insert into itst values (7,'1753-04-15 12:34:45'); > insert into itst values (8,'1752-04-15 12:34:45'); > {code} > From impala > {code:java} > select * from itst order by col1; > {code} > Result: > {code:java} > Query: select * from itst > +--+-+ > | col1 | myts| > +--+-+ > | 1| 2016-04-15 12:34:45 | > | 2| 1949-04-15 12:34:45 | > | 3| 1753-04-15 12:34:45 | > | 4| 1752-04-15 12:34:45 | > | 5| 2016-04-15 10:34:45 | > | 6| 1949-04-15 10:34:45 | > | 7| 1753-04-15 11:34:45 | > | 8| 1752-04-15 11:34:45 | > +--+-+ > {code} > The timestamps are looking good, the DST differences can be seen (hive > inserted it in local time, but impala shows it in UTC) > From impala after setting the command line argument > "--convert_legacy_hive_parquet_utc_timestamps=true" > {code:java} > select * from itst order by col1; > {code} > The result in this case: > {code:java} > Query: select * from itst order by col1 > +--+-+ > | col1 | myts| > +--+-+ > | 1| 2016-04-15 12:34:45 | > | 2| 1949-04-15 12:34:45 | > | 3| 1753-04-15 12:34:45 | > | 4| 1752-04-15 12:34:45 | > | 5| 2016-04-15 12:34:45 | > | 6| 1949-04-15 12:34:45 | > | 7| 1753-04-15 12:51:05 | > | 8| 1752-04-15 12:51:05 | > +--+-+ > {code} > It seems that instead of 11:34:45 it is showing 12:51:05. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-3933) Time zone definitions of Hive/Spark and Impala differ for historical dates
[ https://issues.apache.org/jira/browse/IMPALA-3933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16964176#comment-16964176 ] Manish Maheshwari commented on IMPALA-3933: --- Question - If we switch to Date instead of TimeStamp, will the issue #1 in the above comment get fixed? > Time zone definitions of Hive/Spark and Impala differ for historical dates > -- > > Key: IMPALA-3933 > URL: https://issues.apache.org/jira/browse/IMPALA-3933 > Project: IMPALA > Issue Type: New Feature > Components: Backend >Affects Versions: impala 2.3 >Reporter: Adriano Simone >Priority: Minor > > How the TIMESTAMP skew with convert_legacy_hive_parquet_utc_timestamps=true > Enabling --convert_legacy_hive_parquet_utc_timestamps=true seems to cause > data skew (improper converting) upon the reading for dates earlier than 1900 > (not sure about the exact date). > The following example was run on a server which is in CEST timezone, thus the > time difference is GMT+1 for dates before 1900 (I'm not sure, I haven't > checked the exact starting date of DST computation), and GMT+2 when summer > daylight saving time was applied. 
> create table itst (col1 int, myts timestamp) stored as parquet; > From impala: > {code:java} > insert into itst values (1,'2016-04-15 12:34:45'); > insert into itst values (2,'1949-04-15 12:34:45'); > insert into itst values (3,'1753-04-15 12:34:45'); > insert into itst values (4,'1752-04-15 12:34:45'); > {code} > from hive > {code:java} > insert into itst values (5,'2016-04-15 12:34:45'); > insert into itst values (6,'1949-04-15 12:34:45'); > insert into itst values (7,'1753-04-15 12:34:45'); > insert into itst values (8,'1752-04-15 12:34:45'); > {code} > From impala > {code:java} > select * from itst order by col1; > {code} > Result: > {code:java} > Query: select * from itst > +--+-+ > | col1 | myts| > +--+-+ > | 1| 2016-04-15 12:34:45 | > | 2| 1949-04-15 12:34:45 | > | 3| 1753-04-15 12:34:45 | > | 4| 1752-04-15 12:34:45 | > | 5| 2016-04-15 10:34:45 | > | 6| 1949-04-15 10:34:45 | > | 7| 1753-04-15 11:34:45 | > | 8| 1752-04-15 11:34:45 | > +--+-+ > {code} > The timestamps are looking good, the DST differences can be seen (hive > inserted it in local time, but impala shows it in UTC) > From impala after setting the command line argument > "--convert_legacy_hive_parquet_utc_timestamps=true" > {code:java} > select * from itst order by col1; > {code} > The result in this case: > {code:java} > Query: select * from itst order by col1 > +--+-+ > | col1 | myts| > +--+-+ > | 1| 2016-04-15 12:34:45 | > | 2| 1949-04-15 12:34:45 | > | 3| 1753-04-15 12:34:45 | > | 4| 1752-04-15 12:34:45 | > | 5| 2016-04-15 12:34:45 | > | 6| 1949-04-15 12:34:45 | > | 7| 1753-04-15 12:51:05 | > | 8| 1752-04-15 12:51:05 | > +--+-+ > {code} > It seems that instead of 11:34:45 it is showing 12:51:05. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-9112) Consider removing hdfsExists calls when writing files to S3
[ https://issues.apache.org/jira/browse/IMPALA-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sahil Takiar updated IMPALA-9112: - Summary: Consider removing hdfsExists calls when writing files to S3 (was: Consider removing hdfsExists calls when writing out files) > Consider removing hdfsExists calls when writing files to S3 > --- > > Key: IMPALA-9112 > URL: https://issues.apache.org/jira/browse/IMPALA-9112 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Reporter: Sahil Takiar >Assignee: Sahil Takiar >Priority: Major > > There are a few places in the backend where we call {{hdfsExists}} before > writing out a file. This can cause issues when writing data to S3, because S3 > can cache 404 Not Found errors. This issue manifests itself with errors such > as: > {code:java} > ERROR: Error(s) moving partition files. First error (of 1) was: Hdfs op > (RENAME > s3a://[bucket-name]/[table-name]/_impala_insert_staging/3943ae7ccf00711e_59606d88/.3943ae7ccf00711e-59606d88000b_562151879_dir/year=2015/3943ae7ccf00711e-59606d88000b_1994902389_data.0.parq > TO > s3a://[bucket-name]/[table-name]/3943ae7ccf00711e-59606d88000b_1994902389_data.0.parq) > failed, error was: > s3a://[bucket-name]/[table-name]/_impala_insert_staging/3943ae7ccf00711e_59606d88/.3943ae7ccf00711e-59606d88000b_562151879_dir/year=2015/3943ae7ccf00711e-59606d88000b_1994902389_data.0.parq > Error(5): Input/output error > Root cause: AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: > 404; Error Code: 404 Not Found; Request ID: []; S3 Extended Request ID: > []){code} > HADOOP-13884, HADOOP-13950, HADOOP-16490 - the HDFS clients allow specifying > an "overwrite" option when creating a file; this can avoid doing any HEAD > requests when opening a file. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
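The failure mode is easiest to see with a toy model of an object store that caches negative (404 Not Found) lookups. All class and path names here are hypothetical; this is not the S3A client, just a sketch of why an exists-then-write-then-rename sequence can fail while a plain write-then-rename succeeds:

```python
class CachingObjectStore:
    """Toy model of an S3-like store that remembers 404 results."""

    def __init__(self):
        self._objects = {}
        self._negative_cache = set()  # paths once observed (possibly stale) as absent

    def exists(self, path):
        # Models a HEAD request; a miss is recorded in the negative cache.
        if path in self._objects:
            return True
        self._negative_cache.add(path)
        return False

    def write(self, path, data):
        self._objects[path] = data  # note: does NOT invalidate the stale 404

    def rename(self, src, dst):
        # A stale negative-cache entry makes the source "not found",
        # mirroring the AmazonS3Exception 404 in the report above.
        if src in self._negative_cache:
            raise FileNotFoundError(src)
        self._objects[dst] = self._objects.pop(src)

# exists-then-write: the probe poisons the cache, so the rename fails.
store = CachingObjectStore()
store.exists("staging/part.parq")           # hdfsExists-style check caches a 404
store.write("staging/part.parq", b"rows")
try:
    store.rename("staging/part.parq", "final/part.parq")
except FileNotFoundError:
    pass  # the "Error(s) moving partition files" case

# write with overwrite semantics (no existence probe): the rename succeeds.
store2 = CachingObjectStore()
store2.write("staging/part.parq", b"rows")
store2.rename("staging/part.parq", "final/part.parq")
```

The second sequence is what the HADOOP tickets enable: creating the file with an "overwrite" flag skips the HEAD request entirely, so no 404 is ever cached.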
[jira] [Created] (IMPALA-9112) Consider removing hdfsExists calls when writing out files
Sahil Takiar created IMPALA-9112: Summary: Consider removing hdfsExists calls when writing out files Key: IMPALA-9112 URL: https://issues.apache.org/jira/browse/IMPALA-9112 Project: IMPALA Issue Type: Improvement Components: Backend Reporter: Sahil Takiar Assignee: Sahil Takiar There are a few places in the backend where we call {{hdfsExists}} before writing out a file. This can cause issues when writing data to S3, because S3 can cache 404 Not Found errors. This issue manifests itself with errors such as: {code:java} ERROR: Error(s) moving partition files. First error (of 1) was: Hdfs op (RENAME s3a://[bucket-name]/[table-name]/_impala_insert_staging/3943ae7ccf00711e_59606d88/.3943ae7ccf00711e-59606d88000b_562151879_dir/year=2015/3943ae7ccf00711e-59606d88000b_1994902389_data.0.parq TO s3a://[bucket-name]/[table-name]/3943ae7ccf00711e-59606d88000b_1994902389_data.0.parq) failed, error was: s3a://[bucket-name]/[table-name]/_impala_insert_staging/3943ae7ccf00711e_59606d88/.3943ae7ccf00711e-59606d88000b_562151879_dir/year=2015/3943ae7ccf00711e-59606d88000b_1994902389_data.0.parq Error(5): Input/output error Root cause: AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: []; S3 Extended Request ID: []){code} HADOOP-13884, HADOOP-13950, HADOOP-16490 - the HDFS clients allow specifying an "overwrite" option when creating a file; this can avoid doing any HEAD requests when opening a file. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-9111) Sorting 'Decimal16Value's with codegen enabled but codegen optimizations disabled fails
[ https://issues.apache.org/jira/browse/IMPALA-9111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Becker updated IMPALA-9111: -- Description: Starting the Impala cluster with {code:java} bin/start-impala-cluster.py --impalad_args="-disable_optimization_passes"{code} the following query fails and Impala crashes: {code:java} SELECT d28_1 FROM functional.decimal_rtf_tbl ORDER BY d28_1;{code} This error happens if the inlining pass in OptimizeModule in be/src/codegen/llvm-codegen.cc is not run. It seems the problem only happens with decimals that need to be stored on 16 bytes. Maybe it is some ABI incompatibility with Decimal16Value. Stack trace: {code:java} #0 0x7fda6e63e428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54 #1 0x7fda6e64002a in __GI_abort () at abort.c:89 #2 0x7fda71707149 in os::abort(bool) () from /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so #3 0x7fda718bad27 in VMError::report_and_die() () from /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so #4 0x7fda71710e4f in JVM_handle_linux_signal () from /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so #5 0x7fda71703e48 in signalHandler(int, siginfo_t*, void*) () from /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so #6 #7 0x7fd9c3437f8b in impala::RawValue::Compare(void const*, void const*, impala::ColumnType const&) () #8 0x7fd9c3438e25 in Compare () #9 0x02a26293 in impala::TupleRowComparator::Compare (rhs=0x7fd9c3c4a8b8, lhs=0x7fd9c3c4a8c0, this=0x1284e480) at be/src/util/tuple-row-compare.h:98 #10 impala::TupleRowComparator::Less (rhs=0x7fd9c3c4a8b8, lhs=0x7fd9c3c4a8c0, this=0x1284e480) at be/src/util/tuple-row-compare.h:107 #11 impala::Sorter::TupleSorter::Less (this=0x137b2000, lhs=0x7fd9c3c4a8c0, rhs=0x7fd9c3c4a8b8) at be/src/runtime/sorter-ir.cc:72 #12 0x02a27409 in impala::Sorter::TupleSorter::MedianOfThree (this=0x137b2000, t1=0x14808e50, t2=0x14802d3f, t3=0x14808085) at 
be/src/runtime/sorter-ir.cc:214 #13 0x02a27394 in impala::Sorter::TupleSorter::SelectPivot (this=0x137b2000, begin=..., end=...) at be/src/runtime/sorter-ir.cc:206 #14 0x02a26cd8 in impala::Sorter::TupleSorter::SortHelper (this=0x137b2000, begin=..., end=...) at be/src/runtime/sorter-ir.cc:165 #15 0x02a15e8a in impala::Sorter::TupleSorter::Sort (this=0x137b2000, run=0x13974da0) at be/src/runtime/sorter.cc:755 #16 0x02a18e27 in impala::Sorter::SortCurrentInputRun (this=0x1284e3c0) at be/src/runtime/sorter.cc:956 #17 0x02a183e7 in impala::Sorter::InputDone (this=0x1284e3c0) at be/src/runtime/sorter.cc:892 #18 0x0263bc18 in impala::SortNode::SortInput (this=0xdf63e40, state=0x11e652a0) at be/src/exec/sort-node.cc:187 #19 0x0263a8e0 in impala::SortNode::Open (this=0xdf63e40, state=0x11e652a0) at be/src/exec/sort-node.cc:90 #20 0x020f289a in impala::FragmentInstanceState::Open (this=0xe0571e0) at be/src/runtime/fragment-instance-state.cc:348 #21 0x020ef54c in impala::FragmentInstanceState::Exec (this=0xe0571e0) at be/src/runtime/fragment-instance-state.cc:84 #22 0x02102f9b in impala::QueryState::ExecFInstance (this=0xd376000, fis=0xe0571e0) at be/src/runtime/query-state.cc:650 #23 0x02101268 in impala::QueryStateoperator()(void) const (__closure=0x7fd9c3c4bca8) at be/src/runtime/query-state.cc:558 #24 0x02104c7d in boost::detail::function::void_function_obj_invoker0, void>::invoke(boost::detail::function::function_buffer &) (function_obj_ptr=...) 
at toolchain/boost-1.57.0-p3/include/boost/function/function_template.hpp:153 #25 0x01f04b46 in boost::function0::operator() (this=0x7fd9c3c4bca0) at toolchain/boost-1.57.0-p3/include/boost/function/function_template.hpp:767 #26 0x0247bafd in impala::Thread::SuperviseThread(std::string const&, std::string const&, boost::function, impala::ThreadDebugInfo const*, impala::Promise*) (Python Exception No type named class std::basic_string, std::allocator >::_Rep.: name=, Python Exception No type named class std::basic_string, std::allocator >::_Rep.: category=, functor=..., parent_thread_info=0x7fd9c4c4d950, thread_started=0x7fd9c4c4c8f0) at be/src/util/thread.cc:360 #27 0x02483e81 in boost::_bi::list5, boost::_bi::value, boost::_bi::value >, boost::_bi::value, boost::_bi::value*> >::operator(), impala::ThreadDebugInfo const*, impala::Promise*), boost::_bi::list0>(boost::_bi::type, void (*&)(std::string const&, std::string const&, boost::function, impala::ThreadDebugInfo const*, impala::Promise*), boost::_bi::list0&, int) (this=0xd3857c0, f=@0xd3857b8: 0x247b796 , impala::ThreadDebugInfo const*, impala::Promise*)>, a=...) at toolchain/boost-1.57.0-p3/include/boost/bind/bind.hpp:525 #28 0x02483da5 in boost::_bi::bind_t,
[jira] [Created] (IMPALA-9111) Sorting 'Decimal16Value's with codegen enabled but codegen optimizations disabled fails
Daniel Becker created IMPALA-9111: - Summary: Sorting 'Decimal16Value's with codegen enabled but codegen optimizations disabled fails Key: IMPALA-9111 URL: https://issues.apache.org/jira/browse/IMPALA-9111 Project: IMPALA Issue Type: Bug Components: Backend Reporter: Daniel Becker Starting the Impala cluster with ``` bin/start-impala-cluster.py --impalad_args="-disable_optimization_passes" ``` the following query fails and Impala crashes: ``` SELECT d28_1 FROM functional.decimal_rtf_tbl ORDER BY d28_1; ``` This error happens if the inlining pass in OptimizeModule in be/src/codegen/llvm-codegen.cc is not run. It seems the problem only happens with decimals that need to be stored on 16 bytes. Maybe it is some ABI incompatibility with Decimal16Value. Stack trace: #0 0x7fda6e63e428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54 #1 0x7fda6e64002a in __GI_abort () at abort.c:89 #2 0x7fda71707149 in os::abort(bool) () from /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so #3 0x7fda718bad27 in VMError::report_and_die() () from /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so #4 0x7fda71710e4f in JVM_handle_linux_signal () from /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so #5 0x7fda71703e48 in signalHandler(int, siginfo_t*, void*) () from /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so #6 #7 0x7fd9c3437f8b in impala::RawValue::Compare(void const*, void const*, impala::ColumnType const&) () #8 0x7fd9c3438e25 in Compare () #9 0x02a26293 in impala::TupleRowComparator::Compare (rhs=0x7fd9c3c4a8b8, lhs=0x7fd9c3c4a8c0, this=0x1284e480) at be/src/util/tuple-row-compare.h:98 #10 impala::TupleRowComparator::Less (rhs=0x7fd9c3c4a8b8, lhs=0x7fd9c3c4a8c0, this=0x1284e480) at be/src/util/tuple-row-compare.h:107 #11 impala::Sorter::TupleSorter::Less (this=0x137b2000, lhs=0x7fd9c3c4a8c0, rhs=0x7fd9c3c4a8b8) at be/src/runtime/sorter-ir.cc:72 #12 0x02a27409 in 
impala::Sorter::TupleSorter::MedianOfThree (this=0x137b2000, t1=0x14808e50, t2=0x14802d3f, t3=0x14808085) at be/src/runtime/sorter-ir.cc:214 #13 0x02a27394 in impala::Sorter::TupleSorter::SelectPivot (this=0x137b2000, begin=..., end=...) at be/src/runtime/sorter-ir.cc:206 #14 0x02a26cd8 in impala::Sorter::TupleSorter::SortHelper (this=0x137b2000, begin=..., end=...) at be/src/runtime/sorter-ir.cc:165 #15 0x02a15e8a in impala::Sorter::TupleSorter::Sort (this=0x137b2000, run=0x13974da0) at be/src/runtime/sorter.cc:755 #16 0x02a18e27 in impala::Sorter::SortCurrentInputRun (this=0x1284e3c0) at be/src/runtime/sorter.cc:956 #17 0x02a183e7 in impala::Sorter::InputDone (this=0x1284e3c0) at be/src/runtime/sorter.cc:892 #18 0x0263bc18 in impala::SortNode::SortInput (this=0xdf63e40, state=0x11e652a0) at be/src/exec/sort-node.cc:187 #19 0x0263a8e0 in impala::SortNode::Open (this=0xdf63e40, state=0x11e652a0) at be/src/exec/sort-node.cc:90 #20 0x020f289a in impala::FragmentInstanceState::Open (this=0xe0571e0) at be/src/runtime/fragment-instance-state.cc:348 #21 0x020ef54c in impala::FragmentInstanceState::Exec (this=0xe0571e0) at be/src/runtime/fragment-instance-state.cc:84 #22 0x02102f9b in impala::QueryState::ExecFInstance (this=0xd376000, fis=0xe0571e0) at be/src/runtime/query-state.cc:650 #23 0x02101268 in impala::QueryStateoperator()(void) const (__closure=0x7fd9c3c4bca8) at be/src/runtime/query-state.cc:558 #24 0x02104c7d in boost::detail::function::void_function_obj_invoker0, void>::invoke(boost::detail::function::function_buffer &) (function_obj_ptr=...) 
at toolchain/boost-1.57.0-p3/include/boost/function/function_template.hpp:153 #25 0x01f04b46 in boost::function0::operator() (this=0x7fd9c3c4bca0) at toolchain/boost-1.57.0-p3/include/boost/function/function_template.hpp:767 #26 0x0247bafd in impala::Thread::SuperviseThread(std::string const&, std::string const&, boost::function, impala::ThreadDebugInfo const*, impala::Promise*) (Python Exception No type named class std::basic_string, std::allocator >::_Rep.: name=, Python Exception No type named class std::basic_string, std::allocator >::_Rep.: category=, functor=..., parent_thread_info=0x7fd9c4c4d950, thread_started=0x7fd9c4c4c8f0) at be/src/util/thread.cc:360 #27 0x02483e81 in boost::_bi::list5, boost::_bi::value, boost::_bi::value >, boost::_bi::value, boost::_bi::value*> >::operator(), impala::ThreadDebugInfo const*, impala::Promise*), boost::_bi::list0>(boost::_bi::type, void (*&)(std::string const&, std::string const&, boost::function, impala::ThreadDebugInfo const*, impala::Promise*), boost::_bi::list0&, int) (this=0xd3857c0, f=@0xd3857b8: 0x247b796 , impala::ThreadDebugInfo
[jira] [Updated] (IMPALA-9013) Column Masking DML support
[ https://issues.apache.org/jira/browse/IMPALA-9013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg updated IMPALA-9013: Priority: Critical (was: Major) > Column Masking DML support > -- > > Key: IMPALA-9013 > URL: https://issues.apache.org/jira/browse/IMPALA-9013 > Project: IMPALA > Issue Type: New Feature > Components: Frontend >Reporter: Kurt Deschler >Priority: Critical > > Review Hive implementation to see if anything special needs to be done for > DML. The Hive column masking design doc does not reflect the current code. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-9011) Support column masking on CTEs, views, and derived column names
[ https://issues.apache.org/jira/browse/IMPALA-9011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg updated IMPALA-9011: Priority: Critical (was: Major) > Support column masking on CTEs, views, and derived column names > --- > > Key: IMPALA-9011 > URL: https://issues.apache.org/jira/browse/IMPALA-9011 > Project: IMPALA > Issue Type: New Feature > Components: Frontend >Reporter: Kurt Deschler >Priority: Critical > > CTE/views: dig out underlying column and table names > derived column names i.e. select * from (select 1) as foo - Handle > appropriately. > Also negative cases where the query has an invalid reference. i.e. > WITH foo AS (SELECT c1 FROM t1) SELECT c1 FROM FOO; -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-9012) Allow access to columns with column masks and update tests
[ https://issues.apache.org/jira/browse/IMPALA-9012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg updated IMPALA-9012: Priority: Critical (was: Major) > Allow access to columns with column masks and update tests > -- > > Key: IMPALA-9012 > URL: https://issues.apache.org/jira/browse/IMPALA-9012 > Project: IMPALA > Issue Type: New Feature > Components: Frontend >Reporter: Kurt Deschler >Priority: Critical > > Remove check in RangerAuthorizationChecker::authorizeTableAccess > Remove testcase in RangerAuditLogTest.java -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-9010) Support pre-defined mask types from Ranger UI
[ https://issues.apache.org/jira/browse/IMPALA-9010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg updated IMPALA-9010: Priority: Critical (was: Major) > Support pre-defined mask types from Ranger UI > - > > Key: IMPALA-9010 > URL: https://issues.apache.org/jira/browse/IMPALA-9010 > Project: IMPALA > Issue Type: New Feature > Components: Frontend >Reporter: Kurt Deschler >Priority: Critical > > Review Hive implementation/behavior. > Redact/Partial/Hash/Nullify/Unmasked/Date > These will be implemented as static SQL transforms in Impala -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-9079) Add Auth Interfaces to retrieve column masks and implement for Ranger
[ https://issues.apache.org/jira/browse/IMPALA-9079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg updated IMPALA-9079: Priority: Critical (was: Major) > Add Auth Interfaces to retrieve column masks and implement for Ranger > - > > Key: IMPALA-9079 > URL: https://issues.apache.org/jira/browse/IMPALA-9079 > Project: IMPALA > Issue Type: New Feature >Reporter: Kurt Deschler >Priority: Critical > > Masks definitions can be retrieved from the ranger plugin. Analyzer has > access to AuthorizationFactory via Analyzer::getAuthzFactory(). There are > currently no interfaces through AuthorizationFactory or AuthorizationChecker > to access the column masks from the plugin. These will need to be added and > then implemented for the Ranger plugin. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-9009) Core support for column mask transformation in select list
[ https://issues.apache.org/jira/browse/IMPALA-9009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg updated IMPALA-9009: Priority: Critical (was: Major) > Core support for column mask transformation in select list > -- > > Key: IMPALA-9009 > URL: https://issues.apache.org/jira/browse/IMPALA-9009 > Project: IMPALA > Issue Type: New Feature > Components: Frontend >Reporter: Kurt Deschler >Priority: Critical > > Identify masked columns from SELECT list. > Support custom (user supplied) mask SQL from Ranger. > Parse column mask expressions and substitute into original statement -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
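The substitution step described in IMPALA-9009 can be pictured with a toy string-level rewrite. The mask expressions below are hypothetical stand-ins for what a Ranger policy might supply; Impala would perform this on the analyzed statement rather than on raw SQL text:

```python
def apply_column_masks(select_list, masks):
    """Replace masked columns in a comma-separated SELECT list with their
    mask expressions. Illustrative only: no real SQL parsing is done."""
    rewritten = []
    for col in (c.strip() for c in select_list.split(",")):
        rewritten.append(masks.get(col, col))  # unmasked columns pass through
    return ", ".join(rewritten)

# Hypothetical mask definitions as they might arrive from Ranger policies.
masks = {"ssn": "mask_hash(ssn)", "dob": "NULL"}
print(apply_column_masks("name, ssn, dob", masks))
# name, mask_hash(ssn), NULL
```

The resulting expression list is then analyzed in place of the original SELECT list, which is how custom (user-supplied) mask SQL can be supported with the same machinery as the pre-defined mask types.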
[jira] [Updated] (IMPALA-9089) Failed to link impalad on SUSE12
[ https://issues.apache.org/jira/browse/IMPALA-9089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Donghui Xu updated IMPALA-9089: --- Description: Failed to link impalad on SUSE12, as follows: [100%] Linking CXX executable ../../build/release/service/impalad /toolchain/binutils-2.26.1/bin/ld.gold: internal error in find_view, at fileread.cc:336 collect2: error: ld returned 1 exit status CMakeError.log content is as following: /toolchain/cmake-3.14.3/bin/cmake -E cmake_link_script CMakeFiles/cmTC_31214.dir/link.txt --verbose=1 /toolchain/gcc-4.9.2/bin/gcc -DCHECK_FUNCTION_EXISTS=pthread_create -rdynamic CMakeFiles/cmTC_31214.dir/CheckFunctionExists.c.o -o cmTC_31214 -lpthreads /usr/bin/ld: cannot find -lpthreads collect2: error: ld returned 1 exit status was: Failed to link impalad on SUSE12, as follows: [100%] Linking CXX executable ../../build/release/service/impalad /toolchain/binutils-2.26.1/bin/ld.gold: internal error in find_view, at fileread.cc:336 collect2: error: ld returned 1 exit status CMakeError.log context is as following: /toolchain/cmake-3.14.3/bin/cmake -E cmake_link_script CMakeFiles/cmTC_31214.dir/link.txt --verbose=1 /toolchain/gcc-4.9.2/bin/gcc -DCHECK_FUNCTION_EXISTS=pthread_create -rdynamic CMakeFiles/cmTC_31214.dir/CheckFunctionExists.c.o -o cmTC_31214 -lpthreads /usr/bin/ld: cannot find -lpthreads collect2: error: ld returned 1 exit status > Failed to link impalad on SUSE12 > > > Key: IMPALA-9089 > URL: https://issues.apache.org/jira/browse/IMPALA-9089 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Affects Versions: Impala 3.3.0 >Reporter: Donghui Xu >Priority: Minor > > Failed to link impalad on SUSE12, as follows: > [100%] Linking CXX executable ../../build/release/service/impalad > /toolchain/binutils-2.26.1/bin/ld.gold: internal error in find_view, at > fileread.cc:336 > collect2: error: ld returned 1 exit status > CMakeError.log content is as following: > /toolchain/cmake-3.14.3/bin/cmake -E 
cmake_link_script > CMakeFiles/cmTC_31214.dir/link.txt --verbose=1 > /toolchain/gcc-4.9.2/bin/gcc -DCHECK_FUNCTION_EXISTS=pthread_create > -rdynamic CMakeFiles/cmTC_31214.dir/CheckFunctionExists.c.o -o cmTC_31214 > -lpthreads > /usr/bin/ld: cannot find -lpthreads > collect2: error: ld returned 1 exit status -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-9089) Failed to link impalad on SUSE12
[ https://issues.apache.org/jira/browse/IMPALA-9089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Donghui Xu updated IMPALA-9089: --- Description: Failed to link impalad on SUSE12, as follows: [100%] Linking CXX executable ../../build/release/service/impalad /toolchain/binutils-2.26.1/bin/ld.gold: internal error in find_view, at fileread.cc:336 collect2: error: ld returned 1 exit status CMakeError.log context is as following: /toolchain/cmake-3.14.3/bin/cmake -E cmake_link_script CMakeFiles/cmTC_31214.dir/link.txt --verbose=1 /toolchain/gcc-4.9.2/bin/gcc -DCHECK_FUNCTION_EXISTS=pthread_create -rdynamic CMakeFiles/cmTC_31214.dir/CheckFunctionExists.c.o -o cmTC_31214 -lpthreads /usr/bin/ld: cannot find -lpthreads collect2: error: ld returned 1 exit status was: Failed to link impalad on SUSE12, as follows: [100%] Linking CXX executable ../../build/release/service/impalad /toolchain/binutils-2.26.1/bin/ld.gold: internal error in find_view, at fileread.cc:336 collect2: error: ld returned 1 exit status > Failed to link impalad on SUSE12 > > > Key: IMPALA-9089 > URL: https://issues.apache.org/jira/browse/IMPALA-9089 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Affects Versions: Impala 3.3.0 >Reporter: Donghui Xu >Priority: Minor > > Failed to link impalad on SUSE12, as follows: > [100%] Linking CXX executable ../../build/release/service/impalad > /toolchain/binutils-2.26.1/bin/ld.gold: internal error in find_view, at > fileread.cc:336 > collect2: error: ld returned 1 exit status > CMakeError.log context is as following: > /toolchain/cmake-3.14.3/bin/cmake -E cmake_link_script > CMakeFiles/cmTC_31214.dir/link.txt --verbose=1 > /toolchain/gcc-4.9.2/bin/gcc -DCHECK_FUNCTION_EXISTS=pthread_create > -rdynamic CMakeFiles/cmTC_31214.dir/CheckFunctionExists.c.o -o cmTC_31214 > -lpthreads > /usr/bin/ld: cannot find -lpthreads > collect2: error: ld returned 1 exit status -- This message was sent by Atlassian Jira (v8.3.4#803005) - To 
unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-3933) Time zone definitions of Hive/Spark and Impala differ for historical dates
[ https://issues.apache.org/jira/browse/IMPALA-3933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16963707#comment-16963707 ] Manish Maheshwari commented on IMPALA-3933: --- This is the current behaviour on impala-2.11 Issue 1 - {code:java} We will get the below error if a date < 1400/01/01 is inserted. WARNINGS: Parquet file 'hdfs://ns1/user/hive/warehouse/abc/00_0_copy_5' column 'ts' contains an out of range timestamp. The valid date range is 1400-01-01..-12-31. {code} Issue 2 and workaround using Hive UDF's in Impala {code:java} 1) Set these in Impala - -convert_legacy_hive_parquet_utc_timestamps=true -use_local_tz_for_unix_timestamp_conversions=true 2) In Hive - beeline>create table abc(ts timestamp) stored as parquet; beeline>insert into abc values ('1400-12-12 00:00:00'); beeline>insert into abc values ('1400-9-12 00:00:00'); beeline>insert into abc values ('1500-9-12 00:00:00'); beeline>insert into abc values ('1500-10-12 00:00:00'); beeline>select * from abc; +abc.ts+ 1400-12-12 00:00:00.0 1400-09-12 00:00:00.0 1500-09-12 00:00:00.0 1500-10-12 00:00:00.0 + impala-shell>invalidate metadata; impala-shell>select * from abc; ## This is not the right output -ts- 1400-09-21 08:00:00 1500-09-22 08:00:00 1400-12-21 08:00:00 1500-10-22 08:00:00 - Fetched 4 row(s) in 2.96s ## Now using the Hive UDF in Impala, copy hive-exec-1.1.0-cdh5.x.x.jar as /tmp/hive-udf.jar create function hive_unixtime location '/tmp/hive-udf.jar' symbol='org.apache.hadoop.hive.ql.udf.UDFFromUnixTime' impala-shell> select hive_unixtime(unix_timestamp(ts),'/MM/dd HH:mm') as _mm_dd_hh_mm from abc; --_mm_dd_hh_mm-- 1400/09/12 00:00 1500/09/12 00:00 1400/12/12 00:00 1500/10/12 00:00 -- Fetched 4 row(s) in 0.43s {code} > Time zone definitions of Hive/Spark and Impala differ for historical dates > -- > > Key: IMPALA-3933 > URL: https://issues.apache.org/jira/browse/IMPALA-3933 > Project: IMPALA > Issue Type: New Feature > Components: Backend >Affects Versions: impala 2.3 
>            Reporter: Adriano Simone
>            Priority: Minor
>
> How the TIMESTAMP skews with convert_legacy_hive_parquet_utc_timestamps=true:
> Enabling --convert_legacy_hive_parquet_utc_timestamps=true seems to cause
> data skew (improper conversion) when reading dates earlier than 1900
> (not sure about the exact date).
> The following example was run on a server in the CEST timezone, so the
> time difference is GMT+1 for dates before 1900 (I'm not sure, I haven't
> checked the exact starting date of DST computation), and GMT+2 when summer
> daylight saving time was applied.
> create table itst (col1 int, myts timestamp) stored as parquet;
> From Impala:
> {code:java}
> insert into itst values (1,'2016-04-15 12:34:45');
> insert into itst values (2,'1949-04-15 12:34:45');
> insert into itst values (3,'1753-04-15 12:34:45');
> insert into itst values (4,'1752-04-15 12:34:45');
> {code}
> From Hive:
> {code:java}
> insert into itst values (5,'2016-04-15 12:34:45');
> insert into itst values (6,'1949-04-15 12:34:45');
> insert into itst values (7,'1753-04-15 12:34:45');
> insert into itst values (8,'1752-04-15 12:34:45');
> {code}
> From Impala:
> {code:java}
> select * from itst order by col1;
> {code}
> Result:
> {code:java}
> Query: select * from itst
> +------+---------------------+
> | col1 | myts                |
> +------+---------------------+
> | 1    | 2016-04-15 12:34:45 |
> | 2    | 1949-04-15 12:34:45 |
> | 3    | 1753-04-15 12:34:45 |
> | 4    | 1752-04-15 12:34:45 |
> | 5    | 2016-04-15 10:34:45 |
> | 6    | 1949-04-15 10:34:45 |
> | 7    | 1753-04-15 11:34:45 |
> | 8    | 1752-04-15 11:34:45 |
> +------+---------------------+
> {code}
> The timestamps look good and the DST differences can be seen (Hive inserted
> them in local time, but Impala shows them in UTC).
> From Impala after setting the command line argument
> "--convert_legacy_hive_parquet_utc_timestamps=true":
> {code:java}
> select * from itst order by col1;
> {code}
> The result in this case:
> {code:java}
> Query: select * from itst order by col1
> +------+---------------------+
> | col1 | myts                |
> +------+---------------------+
> | 1    | 2016-04-15 12:34:45 |
> | 2    | 1949-04-15 12:34:45 |
> | 3    | 1753-04-15 12:34:45 |
> | 4    | 1752-04-15 12:34:45 |
> | 5    | 2016-04-15 12:34:45 |
> | 6    | 1949-04-15 12:34:45 |
> | 7    | 1753-04-15 12:51:05 |
> | 8    | 1752-04-15 12:51:05 |
> +------+---------------------+
> {code}
> It seems that instead of 11:34:45 it is showing 12:51:05.
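A likely ingredient in the odd-minute shifts reported for the 1753/1752 rows is that the IANA tz database records pre-standardization offsets as local mean time with minute-and-second precision, while naive local/UTC conversions often assume whole-hour offsets. The following Python sketch (not from the ticket) illustrates this; `Europe/Berlin` is only a stand-in for "a server in CEST", since the reporter's actual zone is not stated:

```python
from datetime import datetime
from zoneinfo import ZoneInfo  # Python 3.9+; reads the system tz database

# Hypothetical zone standing in for "a server in CEST".
tz = ZoneInfo("Europe/Berlin")

for year in (2016, 1949, 1753):
    local = datetime(year, 4, 15, 12, 34, 45, tzinfo=tz)
    # utcoffset() is the local-minus-UTC difference the tz database
    # records for that instant.
    print(year, local.utcoffset())

# Modern dates get the familiar whole-hour CET/CEST offsets, but dates
# before offsets were standardized (1893 for Berlin) fall back to local
# mean time, +00:53:28 for Berlin. A local<->UTC round trip that mixes
# whole-hour and local-mean-time offsets therefore drifts by odd minutes
# and seconds for historical dates -- the same class of skew as the
# 12:51:05 values above, though the exact delta depends on the zone.
```

This also suggests why the exact skew differs between environments: each timezone has its own local-mean-time offset, so two clusters in "the same" GMT+1 region can disagree on historical timestamps.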