Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16369 )

Change subject: IMPALA-9636: Don't run retried query on the blacklisted nodes
......................................................................


Patch Set 12:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/16369/8/be/src/scheduling/admission-controller.cc
File be/src/scheduling/admission-controller.cc:

http://gerrit.cloudera.org:8080/#/c/16369/8/be/src/scheduling/admission-controller.cc@1120
PS8, Line 1120:       cluster_membership_mgr_->GetSnapshot();
              :   DCHECK(membership_snapshot.get() != nullptr);
              :   string blacklist_str = 
membership_snapshot->executor_blacklist.BlacklistToString();
              :   if (!blacklist_str.empty()) {
              :
> why would the original attempt fail due to a timeout instead of an RPC erro
I am wrong. The unit test failed due to timeout since I did not set enough 
waiting time when adding delay for both original and retried queries. I 
adjusted the heart-beat interval, sleep time with CRS_BEFORE_ADMISSION and 
total waiting time for the query, the new unit-test is passed. Removed this 
debug action, and use CRS_BEFORE_ADMISSION instead.


http://gerrit.cloudera.org:8080/#/c/16369/11/be/src/scheduling/admission-controller.cc
File be/src/scheduling/admission-controller.cc:

http://gerrit.cloudera.org:8080/#/c/16369/11/be/src/scheduling/admission-controller.cc@1579
PS11, Line 1579:     DCHECK(!group_state->executor_group().empty());
> might want to add "LOG(WARNING) << queue_node->not_admitted_reason;" as wel
Added LOG message.


http://gerrit.cloudera.org:8080/#/c/16369/11/be/src/scheduling/admission-controller.cc@1575
PS11, Line 1575:     Status ret =
               :         
ExecEnv::GetInstance()->scheduler()->Schedule(group_config, group_state.get());
               :     if (created) delete filtered_executor_group;
               :     if (UNLIKELY(!ret.ok())) return ret;
               :     DCHECK(!group_state->executor_group().empty());
               :     output_schedules->emplace_back(std::move(group_state), 
*executor_group);
               :   }
> couldn't you do this before calling scheduler()->Schedule(...)? if num_exec
Right, move the code before schedule() is called.



--
To view, visit http://gerrit.cloudera.org:8080/16369
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I00bc1b5026efbd0670ffbe57bcebc457d34cb105
Gerrit-Change-Number: 16369
Gerrit-PatchSet: 12
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <stak...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tmarsh...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Fri, 11 Sep 2020 14:12:39 +0000
Gerrit-HasComments: Yes

Reply via email to