Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12299 )

Change subject: IMPALA-2990: timeout unresponsive queries in coordinator
......................................................................


Patch Set 1:

(5 comments)

Cool.

http://gerrit.cloudera.org:8080/#/c/12299/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/12299/1//COMMIT_MSG@29
PS1, Line 29: - Write functional tests once the appropriate mechanisms are in 
place
I think you can "kill -STOP" an impalad running a query, and it will end up 
looking like this to the coordinator. Of course, maybe something else will 
timeout.


http://gerrit.cloudera.org:8080/#/c/12299/1/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/12299/1/be/src/runtime/coordinator.cc@213
PS1, Line 213:   backend_states_init_ = true;
This could probably just use backend_states_.empty() instead. I don't mind the 
extra variable, though. We have .empty() usages elsewhere.


http://gerrit.cloudera.org:8080/#/c/12299/1/be/src/runtime/coordinator.cc@744
PS1, Line 744:   int64_t current_time = MonotonicMillis();
It's confusing, as be/src/runtime/coordinator-backend-state.h has similarly 
named variables that are in UnixMillis() and MonotonicMillis(). I think this is 
correct at the moment, which is good.


http://gerrit.cloudera.org:8080/#/c/12299/1/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/12299/1/be/src/service/impala-server.cc@2212
PS1, Line 2212: true
Do we have a style to write
TerminatedByServer(...., true /* expire */)
?


http://gerrit.cloudera.org:8080/#/c/12299/1/be/src/service/impala-server.cc@2209
PS1, Line 2209:             int64_t lag_time_ms = 
coord->GetMaxBackendStateLagMs();
              :             if (lag_time_ms > FLAGS_max_report_lag_s * 1000) {
              :               
cancellation_thread_pool_->Offer(CancellationWork::TerminatedByServer(
              :                   request_state->query_id(), Status("hung 
query"), true));
If I were a user receiving this error, I'd immediately want to know which 
backend failed. Let's include more information here, including the values of 
the flags, the max lag, and where the max lag was observed.



--
To view, visit http://gerrit.cloudera.org:8080/12299
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I196c8c6a5633b1960e2c3a3884777be9b3824987
Gerrit-Change-Number: 12299
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Marshall <thomasmarsh...@cmu.edu>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Michael Ho <k...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <phi...@cloudera.com>
Gerrit-Comment-Date: Tue, 29 Jan 2019 23:31:01 +0000
Gerrit-HasComments: Yes

Reply via email to