Bikramjeet Vig created IMPALA-10783:
---------------------------------------

             Summary: run_and_verify_query_cancellation_test flakiness and 
improper error handling in TestImpalaShell
                 Key: IMPALA-10783
                 URL: https://issues.apache.org/jira/browse/IMPALA-10783
             Project: IMPALA
          Issue Type: Bug
    Affects Versions: Impala 4.0
            Reporter: Bikramjeet Vig
            Assignee: Bikramjeet Vig


Some tests in TestImpalaShell run impala-shell in a seperate process but don't 
handle the case where the test can fail and the impala-shell process can linger 
on.

One such test run_and_verify_query_cancellation_test, failed due to flakiness 
and since it ran a query that returned a large result, the impala-shell process 
lingered on while fetching results. This caused the query to hold on to 
resources and starve the cluster of memory which caused other tests to fail due 
to not enough memory being available.

The flakiness in run_and_verify_query_cancellation_test was:
{noformat}
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/shell/test_shell_commandline.py:414:
 in test_query_cancellation_during_wait_to_finish
    self.run_and_verify_query_cancellation_test(vector, stmt, "RUNNING")
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/shell/test_shell_commandline.py:422:
 in run_and_verify_query_cancellation_test
    wait_for_query_state(vector, stmt, cancel_at_state)
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/shell/util.py:330:
 in wait_for_query_state
    raise Exception(exc_text)
E   Exception: The found in flight query is not the one under test: set all
{noformat}
the test checked for running queries too fast while the impala-shell was 
starting up. the impala-shell runs "set all" when it starts which the test 
picked up and raised an error thinking it did find its query.

The result of this lingering query caused other tests to fail and throw errors 
like:

{noformat}
query_test/test_tpcds_queries.py:107: in test_tpcds_q18a
    self.run_test_case(self.get_workload() + '-q18a', vector)
common/impala_test_suite.py:678: in run_test_case
    result = exec_fn(query, user=test_section.get('USER', '').strip() or None)
common/impala_test_suite.py:616: in __exec_in_impala
    result = self.__execute_query(target_impalad_client, query, user=user)
common/impala_test_suite.py:936: in __execute_query
    return impalad_client.execute(query, user=user)
common/impala_connection.py:205: in execute
    return self.__beeswax_client.execute(sql_stmt, user=user)
beeswax/impala_beeswax.py:189: in execute
    handle = self.__execute_query(query_string.strip(), user=user)
beeswax/impala_beeswax.py:367: in __execute_query
    self.wait_for_finished(handle)
beeswax/impala_beeswax.py:388: in wait_for_finished
    raise ImpalaBeeswaxException("Query aborted:" + error_log, None)
E   ImpalaBeeswaxException: ImpalaBeeswaxException:
E    Query aborted:Failed to get minimum memory reservation of 452.19 MB on 
daemon impala-ec2-centos74-m5-4xlarge-ondemand-191d.vpc.cloudera.com:27002 for 
query 394b7f96d554f99c:6882496c00000000 due to following error: Failed to 
increase reservation by 452.19 MB because it would exceed the applicable 
reservation limit for the "Process" ReservationTracker: reservation_limit=10.20 
GB reservation=9.91 GB used_reservation=0 child_reservations=9.91 GB
E   The top 5 queries that allocated memory under this tracker are:
E   Query(fa4ece9474a3f865:1b284e6700000000): Reservation=9.60 GB 
ReservationLimit=9.60 GB OtherMemory=118.01 MB Total=9.71 GB Peak=9.71 GB
E   Query(534d07950247ae68:6f5a410d00000000): Reservation=123.50 MB 
ReservationLimit=9.60 GB OtherMemory=2.68 MB Total=126.18 MB Peak=317.02 MB
E   Query(2e4f087aa8263e23:e697d8e800000000): Reservation=50.81 MB 
ReservationLimit=9.60 GB OtherMemory=42.62 MB Total=93.43 MB Peak=173.74 MB
E   Query(6e459d892dfa5050:5959219b00000000): Reservation=28.88 MB 
ReservationLimit=9.60 GB OtherMemory=18.77 MB Total=47.64 MB Peak=53.11 MB
E   Query(ad455bea2e0adc64:2b0bbf3500000000): Reservation=17.94 MB 
ReservationLimit=9.60 GB OtherMemory=15.22 MB Total=33.16 MB Peak=163.99 MB
E   
E   
E   
E   
E   
E   Memory is likely oversubscribed. Reducing query concurrency or configuring 
admission control may help avoid this error.
{noformat}

Logs confirmed that fa4ece9474a3f865:1b284e6700000000 is the query id of the 
query that run_and_verify_query_cancellation_test ran.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org

Reply via email to