Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/20276 )
Change subject: IMPALA-12321: Fix the race condition for updating active catalogd ...................................................................... Patch Set 2: Code-Review+1 (7 comments) Thank you Wenzhe for addressing my comments. The patch looks good to me. I only have 2 more nits if that's OK. http://gerrit.cloudera.org:8080/#/c/20276/1/be/src/catalog/catalog-server.h File be/src/catalog/catalog-server.h: http://gerrit.cloudera.org:8080/#/c/20276/1/be/src/catalog/catalog-server.h@171 PS1, Line 171: bool last_update_for_registration_ = false; > Should be used in CatalogServer::UpdateActiveCatalogd(). Fixed. Done http://gerrit.cloudera.org:8080/#/c/20276/2/be/src/catalog/catalog-server.cc File be/src/catalog/catalog-server.cc: http://gerrit.cloudera.org:8080/#/c/20276/2/be/src/catalog/catalog-server.cc@559 PS2, Line 559: LOG(INFO) << "This catalogd instance is changed to inactive status"; nit: I think it is useful to also print information about last_active_catalogd_version_ and the new active CatalogD's hostname:port when this become inactive. http://gerrit.cloudera.org:8080/#/c/20276/1/be/src/runtime/exec-env.h File be/src/runtime/exec-env.h: http://gerrit.cloudera.org:8080/#/c/20276/1/be/src/runtime/exec-env.h@330 PS1, Line 330: /// True if the last update of active catal > Thanks to catch this. Done http://gerrit.cloudera.org:8080/#/c/20276/1/be/src/statestore/statestore-catalogd-mgr.cc File be/src/statestore/statestore-catalogd-mgr.cc: http://gerrit.cloudera.org:8080/#/c/20276/1/be/src/statestore/statestore-catalogd-mgr.cc@126 PS1, Line 126: DCHECK(num_registered_catalogd_ == 2); > Yes. CatalogD HA is controlled by central manager or k8s server in producti nit: In that case, can we add explicit error message here that Statestore does not permit more than 2 CatalogD registration? maybe: DCHECK(num_registered_catalogd_ == 2) << "No more than 2 CatalogD registration is allowed!"; Not asking for this patch, but I think Statestore should be resilient against this possibility, maybe by just ignoring the third registration. http://gerrit.cloudera.org:8080/#/c/20276/1/be/src/statestore/statestore-subscriber.h File be/src/statestore/statestore-subscriber.h: http://gerrit.cloudera.org:8080/#/c/20276/1/be/src/statestore/statestore-subscriber.h@116 PS1, Line 116: /// for two events: receive registration RPC reply, receive UpdateCatalogD RPC request. : /// is_registration_reply: set it as true when the function is called for the > done. Done http://gerrit.cloudera.org:8080/#/c/20276/1/be/src/statestore/statestore-subscriber.cc File be/src/statestore/statestore-subscriber.cc: http://gerrit.cloudera.org:8080/#/c/20276/1/be/src/statestore/statestore-subscriber.cc@593 PS1, Line 593: // registration. Note that a > Added DCHECK Done http://gerrit.cloudera.org:8080/#/c/20276/1/be/src/statestore/statestore.cc File be/src/statestore/statestore.cc: http://gerrit.cloudera.org:8080/#/c/20276/1/be/src/statestore/statestore.cc@1332 PS1, Line 1332: if (update_skipped) { : // The subscriber skipped processing this update. It's not considered as a failure : // since subscribers can decide what they do with any update. The subscriber is : // left in the receiver list so that RPC will be resent to it in next round. : ++it; > Added log messages in subscriber side. Done -- To view, visit http://gerrit.cloudera.org:8080/20276 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie49947e563d43c59bdd476b28c35be69848ae12a Gerrit-Change-Number: 20276 Gerrit-PatchSet: 2 Gerrit-Owner: Wenzhe Zhou <[email protected]> Gerrit-Reviewer: Abhishek Rawat <[email protected]> Gerrit-Reviewer: Andrew Sherman <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Riza Suminto <[email protected]> Gerrit-Reviewer: Wenzhe Zhou <[email protected]> Gerrit-Comment-Date: Fri, 28 Jul 2023 00:58:42 +0000 Gerrit-HasComments: Yes
