Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13207 )

Change subject: IMPALA-8460: Simplify cluster membership management
......................................................................


Patch Set 7:

(3 comments)

Had a few comments, still need to complete review

http://gerrit.cloudera.org:8080/#/c/13207/5/be/src/runtime/exec-env.cc
File be/src/runtime/exec-env.cc:

http://gerrit.cloudera.org:8080/#/c/13207/5/be/src/runtime/exec-env.cc@454
PS5, Line 454:   impala_server_ = server;
             :   // Register the ImpalaServer with the cluster membership 
manager
             :   cluster_membership_mgr_->SetLocalBeDescFn([server]() {
             :     return server->GetLocalBackendDescriptor();
             :   });
             :   cluster_membership_mgr_->SetUpdateLocalServerFn(
             :       [server](const ClusterMembershipMgr::BackendAddressSet& 
current_backends) {
             :
> Thanks the DCHECK would definitely help. I was also just thinking of a situ
We shouldn't be running ~ExecEnv outside of backend tests AFAIK.


http://gerrit.cloudera.org:8080/#/c/13207/6/be/src/scheduling/cluster-membership-mgr.h
File be/src/scheduling/cluster-membership-mgr.h:

http://gerrit.cloudera.org:8080/#/c/13207/6/be/src/scheduling/cluster-membership-mgr.h@125
PS6, Line 125:   /// membership. This callback will only be called when 
backends are deleted from the
Unfortunately I don't think these Set*Fn() functions are really thread-safe, 
and it looks like Init() registers ClusterMembershipMgr with the statestore 
subscriber, so it's possible that a statestore update could race with these 
functions getting updated.

It's probably unlikely to lead to a bug so long as these functions are always 
set from non-NULL to NULL, but it hard to reason about given that std::function 
isn't just a simple pointer, so the the reader could see a "torn" state. I 
think it would also trip TSAN, etc.

I think we could solve it two ways:
* Require that all the callbacks are registered before Init() is called.
* Add some synchronisations around the function pointers. I don't think any of 
this is on a critical path, so I wouldn't be too concerned about a simple 
SpinLock.

I think the first would be cleaner, although I understand there are some 
initialisation order issues between ExecEnv and ImpalaServer. Maybe we can 
defer initialisation of ClusterMembershipMgr until after SetImpalaServer() is 
called or something like that?


http://gerrit.cloudera.org:8080/#/c/13207/6/be/src/scheduling/cluster-membership-mgr.cc
File be/src/scheduling/cluster-membership-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/13207/6/be/src/scheduling/cluster-membership-mgr.cc@54
PS6, Line 54: void 
ClusterMembershipMgr::SetLocalBeDescFn(BackendDescriptorPtrFn fn) {
Maybe DCHECK that it's non-NULL here and in the other functions below.



--
To view, visit http://gerrit.cloudera.org:8080/13207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib3cf9a8bb060d0c6e9ec8868b7b21ce01f8740a3
Gerrit-Change-Number: 13207
Gerrit-PatchSet: 7
Gerrit-Owner: Lars Volker <l...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bikramjeet....@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Lars Volker <l...@cloudera.com>
Gerrit-Reviewer: Michael Ho <k...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <tmarsh...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Comment-Date: Wed, 08 May 2019 20:51:40 +0000
Gerrit-HasComments: Yes

Reply via email to