Wenzhe Zhou has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/16763 )
Change subject: IMPALA-10258, IMPALA-10109: Fixed flaky test in test_query_retries.py
......................................................................

IMPALA-10258, IMPALA-10109: Fixed flaky test in test_query_retries.py

When TestQueryRetries.test_original_query_cancel was run on S3 with the
query option spool_query_results enabled, the query timed out before
reaching the expected state. This patch doubles the query timeout when
the test is running on S3 and doubles the timeout for the query to reach
the "FINISHED" state.

For IMPALA-10109, test_retries_from_cancellation_pool did not trigger a
query retry when one of the impalads was killed. It seems that the
membership update message was not received and processed by the
coordinator before the terminated state was reached, hence the query
retry was not triggered. This patch reduces heartbeat_frequency and
max_missed_heartbeats so that the statestore takes much less time to
update membership when an impalad is killed, allowing the coordinator
to start the query retry.

Testing:
 - Ran the two tests in a loop for more than 3 hours. No test failures
   occurred.

Change-Id: Ib89f7b01a0f2a66a97f312e779a4ab04f4f347f3
---
M tests/custom_cluster/test_query_retries.py
1 file changed, 12 insertions(+), 2 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/16763/2
--
To view, visit http://gerrit.cloudera.org:8080/16763
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib89f7b01a0f2a66a97f312e779a4ab04f4f347f3
Gerrit-Change-Number: 16763
Gerrit-PatchSet: 2
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tmarsh...@cloudera.com>
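
For readers skimming the change description above, the two adjustments can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in the patch: the helper names `scaled_timeout` and `failure_detection_ms` and all numeric values are hypothetical, and the second function is a simplified model in which the statestore needs roughly `heartbeat_frequency * max_missed_heartbeats` before it treats a killed impalad as gone.

```python
# Illustrative sketch only. Helper names and numbers are assumptions made for
# this example; they are not taken from tests/custom_cluster/test_query_retries.py.


def scaled_timeout(base_timeout_s, on_s3, factor=2):
    """Return the timeout to wait for a query state.

    The timeout is multiplied by `factor` (2 by default, i.e. doubled) when
    the test runs against S3, and left unchanged otherwise.
    """
    return base_timeout_s * factor if on_s3 else base_timeout_s


def failure_detection_ms(heartbeat_frequency_ms, max_missed_heartbeats):
    """Approximate time before a dead subscriber is noticed.

    Simplified model: the subscriber must miss `max_missed_heartbeats`
    consecutive heartbeats sent every `heartbeat_frequency_ms` milliseconds.
    Lowering either value shortens the detection window.
    """
    return heartbeat_frequency_ms * max_missed_heartbeats


if __name__ == "__main__":
    # Hypothetical values, chosen only to show the effect of each change.
    print(scaled_timeout(30, on_s3=True))
    print(failure_detection_ms(1000, 10))
    print(failure_detection_ms(100, 2))
```

Under this model, shrinking both heartbeat settings reduces the window in which the coordinator might reach a terminal state before learning that an executor is gone, which matches the explanation given in the commit message.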