Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/10060 )
Change subject: IMPALA-5216: Make admission control queuing async ...................................................................... Patch Set 7: (40 comments) http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/scheduling/admission-controller.h File be/src/scheduling/admission-controller.h: http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/scheduling/admission-controller.h@44 PS7, Line 44: ClientRequestState::admit_outcome_ > if this enum is meant to be consumed publicly, then it's best to not refer Done http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/scheduling/admission-controller.h@201 PS7, Line 201: admit_status > admit_outcome? Done http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/scheduling/admission-controller.h@203 PS7, Line 203: cancelled > given that there's no Cancel() API in this class, how is that done? (I'm gu Done http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/scheduling/admission-controller.h@204 PS7, Line 204: admit_status > admit_outcome Done http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/scheduling/admission-controller.cc File be/src/scheduling/admission-controller.cc: http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/scheduling/admission-controller.cc@112 PS7, Line 112: Q > nit: don't capitalize queue Done http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/scheduling/admission-controller.cc@899 PS7, Line 899: false; > use && rather than ?:false Done http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.h File be/src/service/client-request-state.h: http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.h@72 PS7, Line 72: /// Must *not* be called with lock_ held. > given that the coordinator object is (unfortunately) current exposed by thi Changed coord() to GetCoord() and added those details to its comment instead. http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.h@114 PS7, Line 114: non-error states (RUNNING_STATE and FINISHED_STATE) > is PENDING_STATE one of those? added it to this method, and updated the implementation to honor the state transition order of: INITIALIZED_STATE -> PENDING_STATE -> RUNNING_STATE -> FINISHED_STATE http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.h@152 PS7, Line 152: coord_ > let's try not to talk about private members when documenting the public int Done http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.h@172 PS7, Line 172: Coordinator* coord() const { : return coord_exec_started_.Load() ? coord_.get() : nullptr; : } > it's confusing that coord() is not actually an accessor given it's name. If Renamed to GetCoord(). For the second part of your comment: That was my initial thought as well, but what happens is that there is a small window between calling coord->Exec() and setting the "coord_" variable in CRS, where the started fragments send in UpdateBackendExecStatus RPCs which we need to pass to the coord object that we currently do by looking up the CRS object directly in "client_request_state_map_". This worked up till now since coord_ is set before the Exec() call but now if we defer setting it, then we get a nullptr instead. I thought of 2 approaches to to solve this: 1. the one that I implemented using "coord_exec_started_" for controlling external access and adding methods CRS::UpdateBackendExecStatus(), CRS::UpdateFilter() that access the "coord_" object directly. OR 2. To defer setting the "coord_" variable in CRS and to serve the UpdateBackendExecStatus and UpdateFilter RPCs by keeping a list of active coordinators separately and looking them up there. Update: As discussed, I will try to look into a way of cleaning this up. http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.h@323 PS7, Line 323: Exec() has been successfully called on it > why do we bother creating the coordinator object ahead of when we call Exec see previous comment. http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.cc File be/src/service/client-request-state.cc: http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.cc@481 PS7, Line 481: query-exec-state > Is there a reason we can't update the group name to client-request-state? I I dont see any reason not to. Since this only needs to be changed in 2 other places apart from this (seems like a small enough change), would you recommend I make the change in this commit itself? http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.cc@482 PS7, Line 482: SubmitToAdmissionController, this, : &admission_controller_thread_, > I think it's a bit confusing that we call this the "admission controller" t Excellent idea. Done. I am a bit iffy about calling it the "exec-thread" since its not exactly executing the query. But I dont have a better suggestion, since anything else might be a bit verbose. http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.cc@671 PS7, Line 671: if (coord() != NULL) { : RETURN_IF_ERROR(coord()->Wait()); : RETURN_IF_ERROR(UpdateCatalog()); : } > Oh, I guess that's what the Join() is for on line 660. Could you add a comm Yes thats right. Done http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.cc@734 PS7, Line 734: : && TOperationState::INITIALIZED_STATE < operation_state) > Ahh I missed that this was implied by the DCHECK on my first pass. Done http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.cc@897 PS7, Line 897: if (uses_admission_control()) admit_outcome_.TrySet(AdmissionOutcome::CANCELLED); > Does this somehow cause the entry in the AC queue to be removed proactively Done http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.cc@1148 PS7, Line 1148: if (!UpdateQueryStatus(admit_status).ok()) return; > in the case that cancel happened while inside AdmitQuery(), AdmitQuery() ca The reason behind setting a cancelled status is that if the wait thread has started up, then it can return with an error status (a state expected by FetchResults() so that it does not try to call GetNext()). Earlier for this case, it used to return because of a cancelled status returned by coord_->Wait(), but with this patch it needs to identify a cancelled state in case we dont even start up the coord_ and exit early. As discussed, I will change the cancellation path in an upcoming commit for IMPALA-1262 http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.cc@1155 PS7, Line 1155: // set it to CANCELLED. > isn't that a user visible behavior change? (though it's probably okay). I totally agree about cleanup. Also, please see previous comment for why I set it to cancelled here. http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.cc@1147 PS7, Line 1147: lock_guard<mutex> l(lock_); : if (!UpdateQueryStatus(admit_status).ok()) return; : if (is_cancelled_) { : VLOG_QUERY << "Cancelled right after successful admission, query id=" : << schedule_->query_id(); : // If query cancelled after being admitted, release Admission control resources. : exec_env_->admission_controller()->ReleaseQuery(*schedule_.get()); : // If cancellation was user initiated, then query_status would be OK, therefore : // set it to CANCELLED. : discard_result(UpdateQueryStatus(Status::CANCELLED)); : return; : } > After reading the admission controller changes, I see that this window is t Yes this is for the small window where cancellation is initiated after the query is successfully admitted (admit_status is OK). Initially I was on the fence between choosing optimization(avoiding coord:exec/cancel) vs simplifying cancellation logic, but decided to choose optimization instead, since the optimization can be significant if the query has a decent number of fragments to start. I dont really have a strong preference either way, so open to suggestions. http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.cc@1171 PS7, Line 1171: lock_guard<mutex> l(lock_); : if (!UpdateQueryStatus(exec_status).ok()) return; : cancelled = is_cancelled_; : if (!cancelled) { : // This needs to be done while holding lock_ as it will ensure that coord_ is : // visible to the cancellation thread if the query is marked cancelled after exiting : // this critical section. : coord_exec_started_.Store(true); : } else { : VLOG_QUERY << "Cancelled right after starting the coordinator query id=" : << schedule_->query_id(); : discard_result(UpdateQueryStatus(Status::CANCELLED)); : // If there was already a non OK query status before we set it to CANCELLED then : // take it to update coord_. : cancellation_status = query_status(); : } : } > As alluded to earlier, I think we might be able to simplfiying this by not please see my reply at client-request-state.h:174 http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/impala-hs2-server.cc File be/src/service/impala-hs2-server.cc: http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/impala-hs2-server.cc@448 PS7, Line 448: DCHECK(status.ok() || !request_state->uses_admission_control()); > why is that the case, and why does this code care? outdated code, forgot to remove it. http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/impala-hs2-server.cc@451 PS7, Line 451: if (!request_state->uses_admission_control()) { : request_state->UpdateNonErrorOperationState(TOperationState::RUNNING_STATE); : } > maybe it'd be clearer to tuck this into CRS::Execute(). That way, this cod Done http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/impala-hs2-server.cc@464 PS7, Line 464: if (!status.ok()) goto return_error; > if we hit an error here, how does the WaitAsync thread finish? when we goto return_error, it calls UnregisterQuery which then calls CancelInternal (which would cause both wait and exec thread(formally admission controller thread) to eventually cancel and exit) and then calls request_state->Done() that ensured that both those threads exit before the query is finally closed and removed. http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/impala-http-handler.cc File be/src/service/impala-http-handler.cc: http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/impala-http-handler.cc@769 PS7, Line 769: == nullptr > not your change, and let's not fix it here, but given that the coord_ ptr i request_state->coord() would be safe in this case since the coord_ object is only accessible through coord() if coord_exec_started_ is true, which is set atomically and while holding lock_. http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/impala-http-handler.cc@769 PS7, Line 769: uses_admission_control( > should that check for PENDING_STATE instead, to be more specific? we need this so that an error is not returned for queries that were cancelled before coord was started and have not been closed/unregistered yet. Or we can check for PENDING/EXCEPTION state instead. http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/impala-http-handler.cc@779 PS7, Line 779: request_state->coord()->GetTExecSummary(&summary); > I guess PrintExecSummary() is okay with an empty summary? have you tested t Yes i tested an empty summary with PrintExecSummary , it will just return an empty string. The summary object is initialized at L746 and in PrintExecSummary it checks for "if (!exec_summary.__isset.nodes) return "";" http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/impala-server.h File be/src/service/impala-server.h: http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/impala-server.h@432 PS7, Line 432: It represents the set of queries that : /// have either started execution or are queued and can be cancelled. > if this is accurate, maybe say instead: Done http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/impala-server.h@434 PS7, Line 434: submitted_queries > alternatively, if we clean up the state machine and how cancel works, it se makes sense. http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/impala-server.h@499 PS7, Line 499: changes when the query : /// is set in-flight. > preexisting, but I'm not really sure what this is saying. I saw this as an outdated comment, and tried to make sense of it by looking at when it actually changes state, it looks like having this is more confusing, so better get rid of it. http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/impala-server.cc@a910 PS7, Line 910: > is it the right thing to delay this? by delaying it it means that that the makes sense. will move it back and updated comment for query_locations_ http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/impala-server.cc@663 PS7, Line 663: return Status::OK(); > What does this look like in the various places that get the summary? Should This method is only called by the hs2 and beeswax's GetExecSummary api, so it totally depends on how the client handles it. I found only one e2e test trying to fetch the exec summary that his this case. My initial implementation was to return an error message, but while fixing that test case, as a user, returning an error made more sense only when there actually is an error (query not found) and in this case an empty result worked well. Dont have a strong preference for either, open to suggestions. http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/impala-server.cc@875 PS7, Line 875: if (FLAGS_stress_metadata_loading_pause_injection_ms > 0) { : SleepForMs(FLAGS_stress_metadata_loading_pause_injection_ms); : } > this should probably use your new debug action thingy, but okay to do that will create a JIRA for this after this patch goes in http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/util/debug-util.h File be/src/util/debug-util.h: http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/util/debug-util.h@112 PS7, Line 112: sleep_option:sleep_time_ms > since we usually use single quotes to document function args, this might be Done http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/util/debug-util.h@112 PS7, Line 112: "sleep_option" > and then you could change: Done http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/util/debug-util.h@114 PS7, Line 114: 'sleep_time_ms > then also make this: <sleep_time_ms> Done http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/util/debug-util.h@115 PS7, Line 115: string > const string& Done http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/util/promise.h File be/src/util/promise.h: http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/util/promise.h@73 PS7, Line 73: ( > nit: missing space Done http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/util/promise.h@77 PS7, Line 77: return val_; > given the subtlty of the lock as commented above, maybe combine these into hmmm, you are right but this would be similar to how its implemented right now.... how about we make it even more stricter by creating a default template parameter for the promise class, like this enum class PromiseType { SINGLE_PRODUCER, MULTIPLE_PRODUCER }; template <typename T, PromiseType = PromiseType::SINGLE_PRODUCER> class Promise { ... ... } http://gerrit.cloudera.org:8080/#/c/10060/7/common/thrift/ImpalaService.thrift File common/thrift/ImpalaService.thrift: http://gerrit.cloudera.org:8080/#/c/10060/7/common/thrift/ImpalaService.thrift@90 PS7, Line 90: "sleep_option:sleep_time_ms", > to keep with the a consistent syntax as #1, I think this would be: Done http://gerrit.cloudera.org:8080/#/c/10060/7/tests/hs2/test_hs2.py File tests/hs2/test_hs2.py: http://gerrit.cloudera.org:8080/#/c/10060/7/tests/hs2/test_hs2.py@468 PS7, Line 468: # ExecSummary. > We should also test getting the exec summary after the query was admitted b Done -- To view, visit http://gerrit.cloudera.org:8080/10060 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I989cf5b259afb8f5bc5c35590c94961c81ce88bf Gerrit-Change-Number: 10060 Gerrit-PatchSet: 7 Gerrit-Owner: Bikramjeet Vig <bikramjeet....@cloudera.com> Gerrit-Reviewer: Bikramjeet Vig <bikramjeet....@cloudera.com> Gerrit-Reviewer: Dan Hecht <dhe...@cloudera.com> Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Comment-Date: Fri, 11 May 2018 22:45:32 +0000 Gerrit-HasComments: Yes