[ https://issues.apache.org/jira/browse/IMPALA-7931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lars Volker updated IMPALA-7931:
--------------------------------
    Description:
On a recent S3 test run test_shutdown_executor hit a timeout waiting for a query to reach state FINISHED. Instead the query stays at state 5 (EXCEPTION).

{noformat}
12:51:11 __________________ TestShutdownCommand.test_shutdown_executor __________________
12:51:11 custom_cluster/test_restart_services.py:209: in test_shutdown_executor
12:51:11     assert self.__fetch_and_get_num_backends(QUERY, before_shutdown_handle) == 3
12:51:11 custom_cluster/test_restart_services.py:356: in __fetch_and_get_num_backends
12:51:11     self.client.QUERY_STATES['FINISHED'], timeout=20)
12:51:11 common/impala_service.py:267: in wait_for_query_state
12:51:11     target_state, query_state)
12:51:11 E   AssertionError: Did not reach query state in time target=4 actual=5
{noformat}

From the logs I can see that the query fails because one of the executors becomes unreachable:

{noformat}
I1204 12:31:39.954125 5609 impala-server.cc:1792] Query a34c3a84775e5599:b2b25eb900000000: Failed due to unreachable impalad(s): jenkins-worker:22001
{noformat}

The query was {{select count\(*) from functional_parquet.alltypes where sleep(1) = bool_col}}. It seems that the query took longer than expected and was still running when the executor shut down.

  was:
On a recent S3 test run test_shutdown_executor hit a timeout waiting for a query to reach state FINISHED. Instead the query stays at state 5 (EXCEPTION).
{noformat}
12:51:11 __________________ TestShutdownCommand.test_shutdown_executor __________________
12:51:11 custom_cluster/test_restart_services.py:209: in test_shutdown_executor
12:51:11     assert self.__fetch_and_get_num_backends(QUERY, before_shutdown_handle) == 3
12:51:11 custom_cluster/test_restart_services.py:356: in __fetch_and_get_num_backends
12:51:11     self.client.QUERY_STATES['FINISHED'], timeout=20)
12:51:11 common/impala_service.py:267: in wait_for_query_state
12:51:11     target_state, query_state)
12:51:11 E   AssertionError: Did not reach query state in time target=4 actual=5
{noformat}

From the logs I can see that the query fails because one of the executors becomes unreachable:

{noformat}
I1204 12:31:39.954125 5609 impala-server.cc:1792] Query a34c3a84775e5599:b2b25eb900000000: Failed due to unreachable impalad(s): jenkins-worker:22001
{noformat}

The query was {{select count(*) from functional_parquet.alltypes where sleep(1) = bool_col}}. It seems that the query took longer than expected and was still running when the executor shut down.

> test_shutdown_executor fails with timeout waiting for query target state
> ------------------------------------------------------------------------
>
>                 Key: IMPALA-7931
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7931
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Infrastructure
>    Affects Versions: Impala 3.2.0
>            Reporter: Lars Volker
>            Priority: Critical
>              Labels: broken-build
>
> On a recent S3 test run test_shutdown_executor hit a timeout waiting for a
> query to reach state FINISHED. Instead the query stays at state 5 (EXCEPTION).
> {noformat}
> 12:51:11 __________________ TestShutdownCommand.test_shutdown_executor __________________
> 12:51:11 custom_cluster/test_restart_services.py:209: in test_shutdown_executor
> 12:51:11     assert self.__fetch_and_get_num_backends(QUERY, before_shutdown_handle) == 3
> 12:51:11 custom_cluster/test_restart_services.py:356: in __fetch_and_get_num_backends
> 12:51:11     self.client.QUERY_STATES['FINISHED'], timeout=20)
> 12:51:11 common/impala_service.py:267: in wait_for_query_state
> 12:51:11     target_state, query_state)
> 12:51:11 E   AssertionError: Did not reach query state in time target=4 actual=5
> {noformat}
> From the logs I can see that the query fails because one of the executors
> becomes unreachable:
> {noformat}
> I1204 12:31:39.954125 5609 impala-server.cc:1792] Query a34c3a84775e5599:b2b25eb900000000: Failed due to unreachable impalad(s): jenkins-worker:22001
> {noformat}
> The query was {{select count\(*) from functional_parquet.alltypes where
> sleep(1) = bool_col}}.
> It seems that the query took longer than expected and was still running when
> the executor shut down.
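
The assertion in the traceback reports a timeout ("Did not reach query state in time") even though the query had already reached state 5 (EXCEPTION), from which it cannot become FINISHED. As a minimal sketch, and not the actual code in common/impala_service.py, a polling helper could report that case separately from a real timeout. The names wait_for_state, get_state, and QueryFailedError are hypothetical; only the state codes 4 (FINISHED) and 5 (EXCEPTION) are taken from the report above.

```python
import time

# State codes as they appear in the assertion message (target=4, actual=5).
FINISHED = 4
EXCEPTION = 5


class QueryFailedError(AssertionError):
    """Hypothetical error: the query reached EXCEPTION instead of the target."""


def wait_for_state(get_state, target_state, timeout_s, interval_s=0.5,
                   clock=time.monotonic, sleep=time.sleep):
    """Poll get_state() until it returns target_state or the timeout expires.

    get_state is a hypothetical zero-argument callable that returns the
    current numeric query state; it stands in for the real client call.
    """
    deadline = clock() + timeout_s
    while True:
        state = get_state()
        if state == target_state:
            return state
        if state == EXCEPTION:
            # Terminal failure: waiting longer cannot help, so say so directly.
            raise QueryFailedError(
                "Query failed: target=%d actual=%d" % (target_state, state))
        if clock() >= deadline:
            raise AssertionError(
                "Did not reach query state in time target=%d actual=%d"
                % (target_state, state))
        sleep(interval_s)
```

With a helper along these lines, the failure above would surface as a failed query rather than as a timeout. It would not address the underlying race between the slow query and the executor shutdown.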