Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16648 )

Change subject: (wip) KUDU-2612: restrict TxnStatusManager calls to be made by 
the leader only
......................................................................


Patch Set 3:

(13 comments)

Mostly nits, but there are a few places that seem a little suspicious w.r.t 
locking.

http://gerrit.cloudera.org:8080/#/c/16648/3/src/kudu/master/txn_manager-test.cc
File src/kudu/master/txn_manager-test.cc:

http://gerrit.cloudera.org:8080/#/c/16648/3/src/kudu/master/txn_manager-test.cc@137
PS3, Line 137:  // Find the replica.
             :       tablet_replica_ = LookupTabletReplica();
nit: spacing

Also why is this necessary? Doesn't the master hold a reference to this 
replica? Is this meant to protect against a race?


http://gerrit.cloudera.org:8080/#/c/16648/3/src/kudu/tablet/tablet_replica.cc
File src/kudu/tablet/tablet_replica.cc:

http://gerrit.cloudera.org:8080/#/c/16648/3/src/kudu/tablet/tablet_replica.cc@285
PS3, Line 285:   {
Do we need to take the state change lock here? Is it OK if we proceed while 
racing with shutdown?


http://gerrit.cloudera.org:8080/#/c/16648/3/src/kudu/tablet/tablet_replica.cc@286
PS3, Line 286:     std::unique_lock<simple_spinlock> lock(lock_);
nit: since we're not using 'lock', we can just use std::lock_guard<> here


http://gerrit.cloudera.org:8080/#/c/16648/3/src/kudu/tablet/tablet_replica.cc@312
PS3, Line 312:     // Submit to pool
             :     if (shutdown_latch_->count() == 0) {
             :      return;
             :     }
Do we still need this if we take the state change lock and hold it for the 
duration of this function?


http://gerrit.cloudera.org:8080/#/c/16648/3/src/kudu/tablet/tablet_replica.cc@877
PS3, Line 877: RegisterTxnCoordinator
nit: this doesn't seem to be "registering" anything per se. Maybe call this 
MarkTxnCoordinatorRegistered() instead?


http://gerrit.cloudera.org:8080/#/c/16648/3/src/kudu/tablet/tablet_replica.cc@887
PS3, Line 887: tablet_
nit: tablet_id()?


http://gerrit.cloudera.org:8080/#/c/16648/3/src/kudu/tablet/tablet_replica.cc@887
PS3, Line 887: "Not registering transaction coordinator for " << tablet_
             :                  << ": tablet not in RUNNING state";
nit: use Substitute() for better readability. Also since this isn't _not_ 
registering, so maybe instead say "Registering transaction coordinator while 
tablet is not RUNNING" or somesuch.


http://gerrit.cloudera.org:8080/#/c/16648/3/src/kudu/transactions/txn_status_manager.h
File src/kudu/transactions/txn_status_manager.h:

http://gerrit.cloudera.org:8080/#/c/16648/3/src/kudu/transactions/txn_status_manager.h@110
PS3, Line 110: specified
nit: I might've missed this; where is this specified?


http://gerrit.cloudera.org:8080/#/c/16648/3/src/kudu/transactions/txn_status_manager.cc
File src/kudu/transactions/txn_status_manager.cc:

http://gerrit.cloudera.org:8080/#/c/16648/3/src/kudu/transactions/txn_status_manager.cc@204
PS3, Line 204:
nit: drop space


http://gerrit.cloudera.org:8080/#/c/16648/3/src/kudu/transactions/txn_status_manager.cc@356
PS3, Line 356:   auto cleanup = MakeScopedCleanup([&]() {
             :     
status_tablet_.tablet_replica_->UpdateTxnCoordinatorReloadTask(false);
             :   });
nit: if we don't intend on cancelling this, we don't need to define a variable 
for it, and we can just use the SCOPED_CLEANUP macro instead


http://gerrit.cloudera.org:8080/#/c/16648/3/src/kudu/transactions/txn_status_manager.cc@434
PS3, Line 434:       if (!check([this]() { return 
this->LoadFromTabletUnlocked(); },
             :                  *consensus, term, kLoadMetaOpDescription).ok()) 
{
             :         return;
             :       }
nit: Why bother putting this in a lambda? IMO it makes it more difficult to 
read, since readers have to reason about a couple layers of function calls here.


http://gerrit.cloudera.org:8080/#/c/16648/3/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

http://gerrit.cloudera.org:8080/#/c/16648/3/src/kudu/tserver/ts_tablet_manager.cc@1426
PS3, Line 1426: if (r->txn_coordinator_registered()) {
It seems like there's still room for a TOCTOU race here if right after this 
check, we deleted the tablet.


http://gerrit.cloudera.org:8080/#/c/16648/3/src/kudu/tserver/ts_tablet_manager.cc@1447
PS3, Line 1447:      auto cleanup = MakeScopedCleanup([&]() {
              :         r->UpdateTxnCoordinatorStalenessTask(false);
              :       });
nit: SCOPED_CLEANUP here too



--
To view, visit http://gerrit.cloudera.org:8080/16648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I42c1ad095dcb4bdffcbe0ecf9631a60bac208c2a
Gerrit-Change-Number: 16648
Gerrit-PatchSet: 3
Gerrit-Owner: Hao Hao <hao....@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Hao Hao <hao....@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Tue, 05 Jan 2021 23:42:19 +0000
Gerrit-HasComments: Yes

Reply via email to