Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/12049 )
Change subject: IMPALA-4555: Make QueryState's status reporting more robust ...................................................................... Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/12049/1/be/src/runtime/query-state.cc File be/src/runtime/query-state.cc: http://gerrit.cloudera.org:8080/#/c/12049/1/be/src/runtime/query-state.cc@58 PS1, Line 58: DEFINE_int32(status_report_max_failures, 3, : "Max number of consecutive failed status reports to allow before cancelling"); > I thought we want to use a fixed timeout approach for the maximum retries ? Actually, max_retries seems to be safer in the sense that it guarantees a minimum amount of time the thread will sleep before giving up as we know the sleep time between each retry. With an absolute timeout, there is no guarantee on the number of retries we will do. If the system is overloaded, the query state thread may not get to run very often before expiration so the number of retries is non-deterministic. http://gerrit.cloudera.org:8080/#/c/12049/1/be/src/runtime/query-state.cc@368 PS1, Line 368: fis_map_[id]->runtime_state()->ClearUnreportedErrors(); There is a race here: we may be clearing newly added errors we added after the profile was computed. -- To view, visit http://gerrit.cloudera.org:8080/12049 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib6007013fc2c9e8eeba11b752ee58fb3038da971 Gerrit-Change-Number: 12049 Gerrit-PatchSet: 1 Gerrit-Owner: Thomas Marshall <thomasmarsh...@cmu.edu> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Michael Ho <k...@cloudera.com> Gerrit-Comment-Date: Tue, 11 Dec 2018 22:25:25 +0000 Gerrit-HasComments: Yes