Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14104 )

Change subject: IMPALA-8803: Coordinator should release admitted memory 
per-backend
......................................................................


Patch Set 1:

(19 comments)

Took a pass over the code, havent looked at the tests yet.
Looks good, mostly nits

http://gerrit.cloudera.org:8080/#/c/14104/1/be/src/runtime/coordinator-backend-state.h
File be/src/runtime/coordinator-backend-state.h:

http://gerrit.cloudera.org:8080/#/c/14104/1/be/src/runtime/coordinator-backend-state.h@63
PS1, Line 63:       const BackendExecParams& exec_params);
nit: we can get the query_id and BackendExecParams from the QuerySchedule


http://gerrit.cloudera.org:8080/#/c/14104/1/be/src/runtime/coordinator-backend-state.h@403
PS1, Line 403: UNRELEASED
> Will "IN_USE" be slightly clear ?
+1


http://gerrit.cloudera.org:8080/#/c/14104/1/be/src/runtime/coordinator-backend-state.h@413
PS1, Line 413:  are considered to
nit: transition to?


http://gerrit.cloudera.org:8080/#/c/14104/1/be/src/runtime/coordinator-backend-state.h@469
PS1, Line 469: BackendFinished
nit: MarkBackendFinished


http://gerrit.cloudera.org:8080/#/c/14104/1/be/src/runtime/coordinator-backend-state.h@470
PS1, Line 470: std::vector<BackendState*>* releasable_backend_states
nit: I usually try to avoid (input,output) type params if possible. Maybe 
consider adding a GetReleasableBackends() method.


http://gerrit.cloudera.org:8080/#/c/14104/1/be/src/runtime/coordinator-backend-state.h@474
PS1, Line 474: released_backend_states
nit: how about backends_to_release?


http://gerrit.cloudera.org:8080/#/c/14104/1/be/src/runtime/coordinator-backend-state.h@474
PS1, Line 474: BackendsReleased
nit: how about ReleaseBackends?


http://gerrit.cloudera.org:8080/#/c/14104/1/be/src/runtime/coordinator-backend-state.h@477
PS1, Line 477: Close
nit: how about call this CloseAndGetUnreleasedBackends() then return the 
unreleased_backend_states instead of having an output param.


http://gerrit.cloudera.org:8080/#/c/14104/1/be/src/runtime/coordinator-backend-state.h@492
PS1, Line 492:   enum ResourceState { UNRELEASED, PENDING, RELEASABLE, 
RELEASED};
nit: space


http://gerrit.cloudera.org:8080/#/c/14104/1/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/14104/1/be/src/runtime/coordinator-backend-state.cc@827
PS1, Line 827: backend_states_.size()
nit: use num_backends_ here and below


http://gerrit.cloudera.org:8080/#/c/14104/1/be/src/runtime/coordinator-backend-state.cc@827
PS1, Line 827: last_backend
nit: how about is_last_unrelease_backend


http://gerrit.cloudera.org:8080/#/c/14104/1/be/src/runtime/coordinator-backend-state.cc@831
PS1, Line 831: coordinator_last_backend
nit: how about is_coord_the_last_unreleased_backend


http://gerrit.cloudera.org:8080/#/c/14104/1/be/src/runtime/coordinator-backend-state.cc@836
PS1, Line 836: released_timer_.ElapsedTime() > RELEASE_BACKEND_STATES_DELAY / 
1000000
should be released_timer_.ElapsedTime() > RELEASE_BACKEND_STATES_DELAY * 1000000

since released_timer_.ElapsedTime() is nano-sec and 
RELEASE_BACKEND_STATES_DELAY is milli-sec

Alternatively we can store RELEASE_BACKEND_STATES_DELAY in nano-sec to avoid 
the multiplication everytime.


http://gerrit.cloudera.org:8080/#/c/14104/1/be/src/runtime/coordinator-backend-state.cc@865
PS1, Line 865: are
nit: as


http://gerrit.cloudera.org:8080/#/c/14104/1/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

http://gerrit.cloudera.org:8080/#/c/14104/1/be/src/runtime/coordinator.h@530
PS1, Line 530:
nit: needs comment


http://gerrit.cloudera.org:8080/#/c/14104/1/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/14104/1/be/src/runtime/coordinator.cc@764
PS1, Line 764:     vector<BackendState*> releasable_backends;
             :     backend_resource_state_->BackendFinished(backend_state, 
&releasable_backends);
             :     if (!releasable_backends.empty()) {
             :       
ReleaseBackendAdmissionControlResources(releasable_backends);
             :       
backend_resource_state_->BackendsReleased(releasable_backends);
             :     }
is it possible that both this thread and a cancellation thread call 
ReleaseBackendAdmissionControlResources on the same backend?
eg. BackendFinished() returns a set of releasable_backends here and the query 
got cancelled so that thread called Close() and got a set of all unrelease 
backends, then both called ReleaseBackendAdmissionControlResources()


http://gerrit.cloudera.org:8080/#/c/14104/1/be/src/scheduling/admission-controller.cc
File be/src/scheduling/admission-controller.cc:

http://gerrit.cloudera.org:8080/#/c/14104/1/be/src/scheduling/admission-controller.cc@367
PS1, Line 367: for (auto host_addr : host_addrs) {
             :     auto backend_exec_params = 
schedule.per_backend_exec_params().find(host_addr);
             :     if (backend_exec_params == schedule.per_
nit: we can probably calculate this in UpdateHostStatsForQueryBackends and 
directly pass the mem to release.


http://gerrit.cloudera.org:8080/#/c/14104/1/be/src/scheduling/admission-controller.cc@410
PS1, Line 410: UpdateHostStatsForQueryBackends
maybe worth mentioning in its comment that this only used to release the query 
and hence decrements the stats


http://gerrit.cloudera.org:8080/#/c/14104/1/be/src/scheduling/admission-controller.cc@420
PS1, Line 420:     int64_t mem_to_release = 
backend_exec_params->second.is_coord_backend ?
             :         schedule.coord_backend_mem_to_admit() :
             :         per_backend_mem_to_admit;
             :     const string host = TNetworkAddressToString(host_addr);
             :     VLOG_ROW << "Update admitted mem reserved for host=" << host
             :              << " prev=" << 
PrintBytes(host_stats_[host].mem_admitted)
             :              << " new=" << 
PrintBytes(host_stats_[host].mem_admitted - mem_to_release);
             :     host_stats_[host].mem_admitted -= mem_to_release;
             :     DCHECK_GE(host_stats_[host].mem_admitted, 0);
             :     VLOG_ROW << "Update admitted queries for host=" << host
             :              << " prev=" << host_stats_[host].num_admitted
             :              << " new=" << host_stats_[host].num_admitted - 1;
             :     host_stats_[host].num_admitted -= 1;
             :     DCHECK_GE(host_stats_[host].num_admitted, 0);
nit: this is the same case as in UpdateHostStats() maybe refactor it out and 
resuse



--
To view, visit http://gerrit.cloudera.org:8080/14104
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I88bb11e0ede7574568020e0277dd8ac8d2586dc9
Gerrit-Change-Number: 14104
Gerrit-PatchSet: 1
Gerrit-Owner: Sahil Takiar <stak...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bikramjeet....@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Michael Ho <k...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <stak...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Comment-Date: Thu, 22 Aug 2019 01:43:01 +0000
Gerrit-HasComments: Yes

Reply via email to