Hao Hao has posted comments on this change. ( http://gerrit.cloudera.org:8080/17097 )
Change subject: KUDU-2612: add PartitionLock and ScopedPartitionLock ...................................................................... Patch Set 2: (23 comments) http://gerrit.cloudera.org:8080/#/c/17097/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17097/2//COMMIT_MSG@17 PS2, Line 17: transaction > nit: you meant a write operation here, not the whole transaction, right? Done http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager-test.cc File src/kudu/tablet/lock_manager-test.cc: http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager-test.cc@159 PS2, Line 159: TEST_F(LockManagerTest, TestLockUnlockPartitionMultiTxn) { : // Acquiring a lock that held by another transaction which has a lower txn ID : // will get a 'TXN_LOCKED_ABORT' server error code. : { : TxnId txn1(0); : ScopedPartitionLock first_lock(&lock_manager_, LockManager::LOCK_EXCLUSIVE, txn1); : TabletServerErrorPB::Code code = TabletServerErrorPB::UNKNOWN_ERROR; : ASSERT_TRUE(first_lock.Acquired(&code)); : ASSERT_EQ(TabletServerErrorPB::UNKNOWN_ERROR, code); : ASSERT_EQ(1, lock_manager_.partition_lock_refs()); : : TxnId txn2(1); : ScopedPartitionLock second_lock(&lock_manager_, LockManager::LOCK_EXCLUSIVE, txn2); : ASSERT_FALSE(second_lock.Acquired(&code)); : ASSERT_EQ(TabletServerErrorPB::TXN_LOCKED_ABORT, code); : } > nit: probably makes sense to split TXN_LOCKED_ABORT code testing into a sep Done http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager-test.cc@179 PS2, Line 179: TxnId txn1(1); : ScopedPartitionLock first_lock(&lock_manager_, LockManager::LOCK_EXCLUSIVE, txn1); : TabletServerErrorPB::Code code = TabletServerErrorPB::UNKNOWN_ERROR; : ASSERT_TRUE(first_lock.Acquired(&code)); : ASSERT_EQ(TabletServerErrorPB::UNKNOWN_ERROR, code); : ASSERT_EQ(1, lock_manager_.partition_lock_refs()); : : TxnId txn2(0); > nit: the values of these transaction IDs are pretty confusing IMO. Might be Done http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager-test.cc@223 PS2, Line 223: } > nit: are there any post-conditions to check after releasing all the row loc Done http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager.h File src/kudu/tablet/lock_manager.h: http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager.h@46 PS2, Line 46: // Super-simple lock manager implementation. This only supports exclusive : // locks, and makes no attempt to prevent deadlocks if a single thread : // takes multiple locks. : // : // In the future when we want to support multi-row transactions of some kind : // we'll have to implement a proper lock manager with all its trappings, : // but this should be enough for the single-row use case. > nit: maybe, update this to reflect changes in this patch? Done http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager.h@85 PS2, Line 85: // Lock to protect 'partition_lock_' and : // 'partition_lock_refs_'. > nit: might be at a single line Done http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager.h@90 PS2, Line 90: by : // the same transaction. > nit: do you mind extending the comment to explain where the information to Done http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager.h@154 PS2, Line 154: class PartitionLock { > nit: mind adding a short description for this class? Done http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager.h@181 PS2, Line 181: assignment > nit: assignment operator Done http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager.h@188 PS2, Line 188: Acquired > style nit: rename to 'IsAcquired' or 'is_acquired' Done http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager.h@189 PS2, Line 189: Release > nit: add a doc? Done http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager.cc File src/kudu/tablet/lock_manager.cc: http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager.cc@365 PS2, Line 365: LockManager::LockMode mode, > I could not find where this parameter is used. Did I miss anything? Yes, this is not used. It is actually not used anywhere in the codebase. I added it just to be symmetric with the definition of ScopedRowLock. My understanding is it may be used later when we introduce shared lock. But I can remove it if you think it is redundant. http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager.cc@368 PS2, Line 368: code_ > Why bother storing this at all if we evaluate it every time we create a Sco So I think the idea is to be symmetric with the destructor. So that as the class name indicate, as long as the ScopedPartitionLock is in the scope. The lock is held without calling Acquired(). http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager.cc@369 PS2, Line 369: DCHECK_EQ(LockManager::LOCK_EXCLUSIVE, mode); > If this should always be LockManager::LOCK_EXCLUSIVE, why not to drop the ' I will remove it. http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager.cc@390 PS2, Line 390: manager_->ReleasePartitionLock(); > Does 'lock_' needs to be updated after calling ReleasePartitionLock() ? Hmm, originally I don't think so, as it should not be used it after Release(). However after think it more, it may be more safe to set it to nullptr. http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager.cc@392 PS2, Line 392: : ScopedPartitionLock::ScopedPartitionLock(ScopedPartitionLock&& other) noexcept { : TakeState(&other); : } : : ScopedPartitionLock& ScopedPartitionLock::operator=(ScopedPartitionLock&& other) noexcept { : TakeState(&other); : return *this; : } > +1 These are actually a similar implementation as the ones in of ScopedRowLock. Since I image a similar usage pattern (std::move to transfer the lock) will occur when requiring PartitionLock at each write op. However, as I don't have enough confidence of how this will be used later. So I decided to remove it in this patch. And add it back whenever actually required. Let me know if you think otherwise. http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager.cc@415 PS2, Line 415: () > nit: the parentheses are not required here Done http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager.cc@422 PS2, Line 422: PartitionLock* LockManager::AcquirePartitionLock( : const TxnId& txn_id, : TabletServerErrorPB::Code* code) { > What happens if a non-transaction attempts to take the lock while a transac Similar to Alexey's question, at this point I assume the caller is transactional op. So no lock will be returned in this case. Sure, will do. http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager.cc@425 PS2, Line 425: PREDICT_TRUE > Do we expect this to be used by non-transactional operations as well? If s Yeah, good point. At this point, I image non-transactional ops will not try to acquire the partition lock as it is not required. However, this may change depends on how the partition lock is checked in a following patch. http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager.cc@434 PS2, Line 434: if (PREDICT_FALSE(txn_id.value() > partition_lock_->txn_id().value())) { : *code = tserver::TabletServerErrorPB::TXN_LOCKED_ABORT; : return nullptr; : } : if (PREDICT_FALSE(txn_id.value() < partition_lock_->txn_id().value())) { : *code = tserver::TabletServerErrorPB::TXN_LOCKED_RETRY; : return nullptr; : } > Looking at this code makes me think that the most anticipated case if when Right, since transaction will try to hold the partition lock when the txn begins. If the txn cannot acquire the lock at that moment, it will be aborted or retried. There will be no following write ops which are the majority times when a partition lock is checked. http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager.cc@449 PS2, Line 449: } else { : return nullptr; : } > readability nit: put this condition first and remove the extra 'else' claus Done http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tserver/tserver.proto File src/kudu/tserver/tserver.proto: http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tserver/tserver.proto@116 PS2, Line 116: transaction > nit: to be clear, the whole transaction doesn't need to be retried, just th Right, although my understanding is along with write op, the participant op (Begin TXN) can also be retried if the lock is held by other transactions at that moment. http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tserver/tserver.proto@118 PS2, Line 118: TXN_LOCKED_RETRY > nit: maybe, rename into TXN_LOCKED_RETRY_OP to signify that it's an operati Done -- To view, visit http://gerrit.cloudera.org:8080/17097 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I158115739ce3e7cfb77bbcb854e834336c1256b1 Gerrit-Change-Number: 17097 Gerrit-PatchSet: 2 Gerrit-Owner: Hao Hao <hao....@cloudera.com> Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com> Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com> Gerrit-Reviewer: Hao Hao <hao....@cloudera.com> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Wed, 24 Feb 2021 01:44:14 +0000 Gerrit-HasComments: Yes