Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/13207 )
Change subject: IMPALA-8460: Simplify cluster membership management ...................................................................... Patch Set 3: (25 comments) I did an initial quick pass, focused mainly on the interfaces and overall code structure. I haven't made a concerted effort to think about correctness and/or look for bugs yet but I thought it was worth flushing out the feedback at this point. http://gerrit.cloudera.org:8080/#/c/13207/2/be/src/scheduling/cluster-membership-mgr.h File be/src/scheduling/cluster-membership-mgr.h: http://gerrit.cloudera.org:8080/#/c/13207/2/be/src/scheduling/cluster-membership-mgr.h@96 PS2, Line 96: /// A callback to notify the local ImpalaServer of changes to the cluster membership. Should we document that no locks are held during by the calling thread of the callback? That's the trickiest problem I've generally seen with callbacks and it's not an issue here. http://gerrit.cloudera.org:8080/#/c/13207/2/be/src/scheduling/cluster-membership-mgr.cc File be/src/scheduling/cluster-membership-mgr.cc: http://gerrit.cloudera.org:8080/#/c/13207/2/be/src/scheduling/cluster-membership-mgr.cc@94 PS2, Line 94: not nit: no http://gerrit.cloudera.org:8080/#/c/13207/2/be/src/scheduling/cluster-membership-mgr.cc@98 PS2, Line 98: VLOG(3) << "Processing statestore update"; I feel like a lot of these messages could actually be VLOG(2) or even VLOG(1), since they're relatively rare and mean *something* happened, like a cluster membership change or this impalad registering itself. Don't need to fix it right now, but worth thinking about if it would help debugging in future. http://gerrit.cloudera.org:8080/#/c/13207/2/be/src/scheduling/cluster-membership-mgr.cc@108 PS2, Line 108: // Make a copy. Currently no other code would modify the state but we take the lock as We're assuming that only one thread executes the below code at a time, right, otherwise they'd race to update the sate. Can we use a utility to verify this - I believe DFAKE_SCOPED_LOCK_THREAD_LOCKED from gutil does this, but I'm just going on Todd's advice. Ok to defer but would be nice to clarify invariants as part of refactor. http://gerrit.cloudera.org:8080/#/c/13207/2/be/src/scheduling/cluster-membership-mgr.cc@113 PS2, Line 113: if (local_be_desc.get() != nullptr) { Conditional fits on one line? http://gerrit.cloudera.org:8080/#/c/13207/2/be/src/scheduling/cluster-membership-mgr.cc@145 PS2, Line 145: VLOG(2) << "Error deserializing membership topic item with key: " << item.key; This seems like something we should definitely log at INFO or probably even WARNING level, since something went wrong! http://gerrit.cloudera.org:8080/#/c/13207/2/be/src/scheduling/cluster-membership-mgr.cc@152 PS2, Line 152: VLOG(1) << "Ignoring subscription request with empty IP address from subscriber: " Same here http://gerrit.cloudera.org:8080/#/c/13207/2/be/src/scheduling/cluster-membership-mgr.cc@201 PS2, Line 201: desccriptor typo http://gerrit.cloudera.org:8080/#/c/13207/2/be/src/scheduling/cluster-membership-mgr.cc@242 PS2, Line 242: LOG(WARNING) << "Failed to serialize Impala backend descriptor for statestore topic:" It seems like this really should be a LOG(FATAL) or a CHECK since we never would expect this to happen and an Impalad that can't register itself with the SS is useless. OK to ignore to avoid scope creep of refactor. http://gerrit.cloudera.org:8080/#/c/13207/2/be/src/scheduling/cluster-membership-mgr.cc@252 PS2, Line 252: BackendAddressSet current_backend_set; This is O(n^2) where n in the cluster size, in some sense because we potentially do O(n) work for each of O(n) registrations. I think this is probably OK though right? Because generally the membership changes won't be continual and this is fairly cheap. I do feel like this is less error prone than trying to only send deltas, so I think I'm in favour. http://gerrit.cloudera.org:8080/#/c/13207/2/be/src/scheduling/cluster-membership-mgr.cc@282 PS2, Line 282: const BackendDescriptorPtr& local_be_desc){ nit: whitespace http://gerrit.cloudera.org:8080/#/c/13207/2/be/src/scheduling/cluster-membership-mgr.cc@283 PS2, Line 283: if (local_be_desc.get() == nullptr) return false; I don't personally like the redundant .get() here but it is a matter of taste :) http://gerrit.cloudera.org:8080/#/c/13207/2/be/src/scheduling/cluster-membership-mgr.cc@312 PS2, Line 312: << " is shutting down"; return false here and on l308? http://gerrit.cloudera.org:8080/#/c/13207/3/be/src/scheduling/executor-group.h File be/src/scheduling/executor-group.h: http://gerrit.cloudera.org:8080/#/c/13207/3/be/src/scheduling/executor-group.h@38 PS3, Line 38: ExecutorGroup(const std::vector<TNetworkAddress>& executors); Explicit constructors? http://gerrit.cloudera.org:8080/#/c/13207/3/be/src/scheduling/executor-group.h@41 PS3, Line 41: typedef std::list<TBackendDescriptor> ExecutorList; Any reason this shouldn't be a vector? That's generally a better default. It doesn't look like we need stable iterators into a list, I don't think and I don't see anywhere where the asymptotic complexity is better. http://gerrit.cloudera.org:8080/#/c/13207/3/be/src/scheduling/executor-group.h@43 PS3, Line 43: /// Return the list of executors on a particular host. The caller must make sure that What's the thread safety story for the returned list? http://gerrit.cloudera.org:8080/#/c/13207/3/be/src/scheduling/executor-group.h@47 PS3, Line 47: void GetAllExecutorIps(std::vector<IpAddr>* ip_addresses) const; Can you document whether these append to the vectors or replace the contents. And if it replaces the contents, why not just return a new vector? http://gerrit.cloudera.org:8080/#/c/13207/3/be/src/scheduling/executor-group.h@49 PS3, Line 49: void AddExecutor(const TBackendDescriptor& be_desc); I think these could do with a basic description. There may be some details worth documenting, e.g. what happens if the executor already exists in the group. http://gerrit.cloudera.org:8080/#/c/13207/3/be/src/scheduling/executor-group.h@62 PS3, Line 62: /// lifetime of this ExecutorGroup. What happens if the executor is removed with RemoveExecutor()? http://gerrit.cloudera.org:8080/#/c/13207/3/be/src/scheduling/executor-group.h@65 PS3, Line 65: const HashRing* GetHashRing() const { return &executor_ip_hash_ring_; } Maybe worth documenting lifetime/thread safety of this? http://gerrit.cloudera.org:8080/#/c/13207/3/be/src/scheduling/executor-group.h@71 PS3, Line 71: typedef boost::unordered_map<IpAddr, ExecutorList> ExecutorMap; Maybe a good time to switch these to std::unordered_map? http://gerrit.cloudera.org:8080/#/c/13207/3/be/src/scheduling/executor-group.cc File be/src/scheduling/executor-group.cc: http://gerrit.cloudera.org:8080/#/c/13207/3/be/src/scheduling/executor-group.cc@54 PS3, Line 54: ip_addresses->reserve(NumExecutors()); This isn't right if ip_addresses already has entries. Not sure if this happens in practice - if it doesn't though we should just return the vector I think. http://gerrit.cloudera.org:8080/#/c/13207/3/be/src/scheduling/executor-group.cc@109 PS3, Line 109: LIKELY LIKELY annotation probably isn't necessary. Ok to leave. http://gerrit.cloudera.org:8080/#/c/13207/3/be/src/scheduling/scheduler.h File be/src/scheduling/scheduler.h: PS3: It's nice that the scheduler has become more focused on scheduling and less stateful. http://gerrit.cloudera.org:8080/#/c/13207/3/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/13207/3/be/src/service/impala-server.cc@a1744 PS3, Line 1744: :) Nice to see this file get slightly simpler -- To view, visit http://gerrit.cloudera.org:8080/13207 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib3cf9a8bb060d0c6e9ec8868b7b21ce01f8740a3 Gerrit-Change-Number: 13207 Gerrit-PatchSet: 3 Gerrit-Owner: Lars Volker <l...@cloudera.com> Gerrit-Reviewer: Bikramjeet Vig <bikramjeet....@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Lars Volker <l...@cloudera.com> Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Comment-Date: Thu, 02 May 2019 17:26:49 +0000 Gerrit-HasComments: Yes