Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16763 )

Change subject: IMPALA-10258, IMPALA-10109: Fixed flaky test in 
test_query_retries.py
......................................................................


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/16763/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16763/2//COMMIT_MSG@7
PS2, Line 7: IMPALA-10258, IMPALA-10109
since these issues are basically unrelated, could you separate them out into 
two reviews?


http://gerrit.cloudera.org:8080/#/c/16763/2//COMMIT_MSG@9
PS2, Line 9: When TestQueryRetries.test_original_query_cancel was ran on s3
I'm not sure I understand what you're saying the issue is:

According to the JIRA, the test was waiting for the query to reach state 
"RUNNING", but it was already at state "EXCEPTION" (QueryState = 5, see 
beeswax.thrift). At that point in the test, the query shouldn't have failed, 
since the impalad hasn't been killed yet, so I'm really not sure what could have 
happened, and unfortunately it doesn't look like we have the logs for it.
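
To illustrate the distinction I mean, here is a rough sketch of a wait loop that fails fast when the query is already in "EXCEPTION" instead of waiting out the timeout for "RUNNING". The helper wait_for_state and its get_state callback are hypothetical and are not the test suite's actual API. Only EXCEPTION = 5 comes from the comment above; RUNNING = 3 is my recollection of beeswax.thrift, so treat it as an assumption.

```python
import time

# EXCEPTION = 5 is the value quoted in the comment above. RUNNING = 3 is an
# assumption about beeswax.thrift and should be checked against the file.
RUNNING = 3
EXCEPTION = 5


def wait_for_state(get_state, expected, timeout_s=5.0, poll_interval_s=0.01):
    """Poll get_state() until it returns `expected` (hypothetical helper).

    Raises RuntimeError as soon as the query is seen in EXCEPTION, and
    TimeoutError if `expected` is not reached within timeout_s.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        state = get_state()
        if state == expected:
            return state
        if state == EXCEPTION:
            # Failing here separates "the query failed early" from
            # "the query was slow to start".
            raise RuntimeError(
                "query hit EXCEPTION while waiting for state %d" % expected)
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "timed out waiting for state %d, last saw %d" % (expected, state))
        time.sleep(poll_interval_s)
```

A loop like this would not explain why the query failed, but it would report the early failure directly rather than as a wait on "RUNNING" that never succeeds.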


http://gerrit.cloudera.org:8080/#/c/16763/2//COMMIT_MSG@16
PS2, Line 16: For IMPALA-10109, test_retries_from_cancellation_pool did not
I'm not sure I understand what you're saying the issue is:

According to the JIRA, the query timed out after ~784s, which is a lot longer 
than the default statestore time-to-detect-failure of heartbeat_frequency x 
max_missed = 1000ms x 10 = 10s. So it seems like the coordinator should have 
had plenty of time to get the statestore message, even under the old values.
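
The arithmetic above, spelled out as a small sketch. The constant and function names are illustrative only; they mirror the terms used in this comment and are not meant to be the exact statestore flag names.

```python
# Illustrative names only; these mirror the terms in the comment above and
# are not meant to be the exact statestore flag names.
HEARTBEAT_FREQUENCY_MS = 1000   # default heartbeat interval quoted above
MAX_MISSED_HEARTBEATS = 10      # default missed-heartbeat limit quoted above
OBSERVED_TIMEOUT_S = 784        # approximate query timeout from the JIRA


def time_to_detect_failure_s(heartbeat_frequency_ms, max_missed):
    """Nominal time for the statestore to declare a subscriber failed."""
    return heartbeat_frequency_ms * max_missed / 1000.0


detect_s = time_to_detect_failure_s(HEARTBEAT_FREQUENCY_MS, MAX_MISSED_HEARTBEATS)
ratio = OBSERVED_TIMEOUT_S / detect_s
print("time to detect failure: %.0fs" % detect_s)
print("observed timeout is %.1fx longer" % ratio)
```

With the default values, the observed timeout is well over an order of magnitude longer than the detection window, which is why a missed statestore update seems an unlikely explanation on its own.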

Looking through the logs, I'm a little confused by what I see - the coordinator 
says the query was only scheduled on 2 backends, but I think the test assumes 
that it gets scheduled on all 3 backends in the minicluster (see 
__kill_random_impalad()). I also see a reference to CancelFromThreadPool in 
QueryExecMgr on impalad_node1, but that shouldn't be hit unless the coordinator 
is killed, which it shouldn't have been.



--
To view, visit http://gerrit.cloudera.org:8080/16763
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib89f7b01a0f2a66a97f312e779a4ab04f4f347f3
Gerrit-Change-Number: 16763
Gerrit-PatchSet: 2
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tmarsh...@cloudera.com>
Gerrit-Comment-Date: Tue, 24 Nov 2020 20:36:46 +0000
Gerrit-HasComments: Yes
