Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/16648 )
Change subject: KUDU-2612: restrict TxnStatusManager calls to be made by the leader only ...................................................................... Patch Set 7: (4 comments) Yep, I agree with Andrew: after quick looks this looks much better to me now! I hope to get one more pass tomorrow morning. As for now, just a few nits not related to the essence, but rather syntax and code conventions. http://gerrit.cloudera.org:8080/#/c/16648/6/src/kudu/client/client-test.cc File src/kudu/client/client-test.cc: http://gerrit.cloudera.org:8080/#/c/16648/6/src/kudu/client/client-test.cc@426 PS6, Line 426: dynamic_cast nit: I guess we prefer to use down_cast<> for this type of casts in the code; see src/kudu/gutil/casts.h for details http://gerrit.cloudera.org:8080/#/c/16648/3/src/kudu/integration-tests/txn_status_table-itest.cc File src/kudu/integration-tests/txn_status_table-itest.cc: http://gerrit.cloudera.org:8080/#/c/16648/3/src/kudu/integration-tests/txn_status_table-itest.cc@725 PS3, Line 725: FLAGS_leader_failure_max_missed_heartbeat_periods = 1; : FLAGS_raft_heartbeat_interval_ms = 30; > Yes, ran looped the test in TSAN and didn't see failure caused by this. Is The short heartbeat interval combined with just single missing heartbeat period prompted me to make sure this test eventually converges when run under ASAN/TSAN with --stress_cpu_threads=16 or alike. But since you verified it works as needed, it's great! http://gerrit.cloudera.org:8080/#/c/16648/7/src/kudu/tserver/tablet_service.cc File src/kudu/tserver/tablet_service.cc: http://gerrit.cloudera.org:8080/#/c/16648/7/src/kudu/tserver/tablet_service.cc@1262 PS7, Line 1262: dynamic_cast nit: use down_cast<> here as well? http://gerrit.cloudera.org:8080/#/c/16648/7/src/kudu/tserver/ts_tablet_manager.cc File src/kudu/tserver/ts_tablet_manager.cc: http://gerrit.cloudera.org:8080/#/c/16648/7/src/kudu/tserver/ts_tablet_manager.cc@1453 PS7, Line 1453: TxnStatusManager* txn_status_manager = : dynamic_cast<TxnStatusManager*>(coordinator); : TxnStatusManager::ScopedLeaderSharedLock l(txn_status_manager); Looking at this again and again prompts me to ask the following question: Is it possible to introduce a constructor of ScopedLeaderSharedLock with a parameter of TxnCoordinator* type? The idea is to perform the downcast (using down_cast or dynamic_cast) under the hood, and then report an error via fist_failed_status() in debug mode if the coordinator could not be downcasted to TxnStatusManager? What do you think? -- To view, visit http://gerrit.cloudera.org:8080/16648 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I42c1ad095dcb4bdffcbe0ecf9631a60bac208c2a Gerrit-Change-Number: 16648 Gerrit-PatchSet: 7 Gerrit-Owner: Hao Hao <hao....@cloudera.com> Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com> Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com> Gerrit-Reviewer: Hao Hao <hao....@cloudera.com> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Thu, 14 Jan 2021 08:34:25 +0000 Gerrit-HasComments: Yes