Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17097 )

Change subject: KUDU-2612: add PartitionLock and ScopedPartitionLock
......................................................................


Patch Set 2:

(23 comments)

http://gerrit.cloudera.org:8080/#/c/17097/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17097/2//COMMIT_MSG@17
PS2, Line 17: transaction
> nit: you meant a write operation here, not the whole transaction, right?
Done


http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager-test.cc
File src/kudu/tablet/lock_manager-test.cc:

http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager-test.cc@159
PS2, Line 159: TEST_F(LockManagerTest, TestLockUnlockPartitionMultiTxn) {
             :   // Acquiring a lock that held by another transaction which has 
a lower txn ID
             :   // will get a 'TXN_LOCKED_ABORT' server error code.
             :   {
             :     TxnId txn1(0);
             :     ScopedPartitionLock first_lock(&lock_manager_, 
LockManager::LOCK_EXCLUSIVE, txn1);
             :     TabletServerErrorPB::Code code = 
TabletServerErrorPB::UNKNOWN_ERROR;
             :     ASSERT_TRUE(first_lock.Acquired(&code));
             :     ASSERT_EQ(TabletServerErrorPB::UNKNOWN_ERROR, code);
             :     ASSERT_EQ(1, lock_manager_.partition_lock_refs());
             :
             :     TxnId txn2(1);
             :     ScopedPartitionLock second_lock(&lock_manager_, 
LockManager::LOCK_EXCLUSIVE, txn2);
             :     ASSERT_FALSE(second_lock.Acquired(&code));
             :     ASSERT_EQ(TabletServerErrorPB::TXN_LOCKED_ABORT, code);
             :   }
> nit: probably makes sense to split TXN_LOCKED_ABORT code testing into a sep
Done


http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager-test.cc@179
PS2, Line 179:     TxnId txn1(1);
             :     ScopedPartitionLock first_lock(&lock_manager_, 
LockManager::LOCK_EXCLUSIVE, txn1);
             :     TabletServerErrorPB::Code code = 
TabletServerErrorPB::UNKNOWN_ERROR;
             :     ASSERT_TRUE(first_lock.Acquired(&code));
             :     ASSERT_EQ(TabletServerErrorPB::UNKNOWN_ERROR, code);
             :     ASSERT_EQ(1, lock_manager_.partition_lock_refs());
             :
             :     TxnId txn2(0);
> nit: the values of these transaction IDs are pretty confusing IMO. Might be
Done


http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager-test.cc@223
PS2, Line 223: }
> nit: are there any post-conditions to check after releasing all the row loc
Done


http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager.h
File src/kudu/tablet/lock_manager.h:

http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager.h@46
PS2, Line 46: // Super-simple lock manager implementation. This only supports 
exclusive
            : // locks, and makes no attempt to prevent deadlocks if a single 
thread
            : // takes multiple locks.
            : //
            : // In the future when we want to support multi-row transactions 
of some kind
            : // we'll have to implement a proper lock manager with all its 
trappings,
            : // but this should be enough for the single-row use case.
> nit: maybe, update this to reflect changes in this patch?
Done


http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager.h@85
PS2, Line 85:   // Lock to protect 'partition_lock_' and
            :   // 'partition_lock_refs_'.
> nit: might be at a single line
Done


http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager.h@90
PS2, Line 90: by
            :   // the same transaction.
> nit: do you mind extending the comment to explain where the information to
Done


http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager.h@154
PS2, Line 154: class PartitionLock {
> nit: mind adding a short description for this class?
Done


http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager.h@181
PS2, Line 181: assignment
> nit: assignment operator
Done


http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager.h@188
PS2, Line 188: Acquired
> style nit: rename to 'IsAcquired' or 'is_acquired'
Done


http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager.h@189
PS2, Line 189: Release
> nit: add a doc?
Done


http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager.cc
File src/kudu/tablet/lock_manager.cc:

http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager.cc@365
PS2, Line 365: LockManager::LockMode mode,
> I could not find where this parameter is used.  Did I miss anything?
Yes, this is not used. It is actually not used anywhere in the codebase. I 
added it just to be symmetric with the definition of ScopedRowLock. My 
understanding is it may be used later when we introduce shared lock. But I can 
remove it if you think it is redundant.


http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager.cc@368
PS2, Line 368: code_
> Why bother storing this at all if we evaluate it every time we create a Sco
So I think the idea is to be symmetric with the destructor. So that as the 
class name indicate, as long as the ScopedPartitionLock is in the scope. The 
lock is held without calling Acquired().


http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager.cc@369
PS2, Line 369: DCHECK_EQ(LockManager::LOCK_EXCLUSIVE, mode);
> If this should always be LockManager::LOCK_EXCLUSIVE, why not to drop the '
I will remove it.


http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager.cc@390
PS2, Line 390: manager_->ReleasePartitionLock();
> Does 'lock_' needs to be updated after calling ReleasePartitionLock() ?
Hmm, originally I don't think so, as it should not be used it after Release(). 
However after think it more, it may be more safe to set it to nullptr.


http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager.cc@392
PS2, Line 392:
             : ScopedPartitionLock::ScopedPartitionLock(ScopedPartitionLock&& 
other) noexcept {
             :   TakeState(&other);
             : }
             :
             : ScopedPartitionLock& 
ScopedPartitionLock::operator=(ScopedPartitionLock&& other) noexcept {
             :   TakeState(&other);
             :   return *this;
             : }
> +1
These are actually a similar implementation as the ones in of ScopedRowLock. 
Since I image a similar usage pattern (std::move to transfer the lock) will 
occur when requiring PartitionLock at each write op. However, as I don't have 
enough confidence of how this will be used later. So I decided to remove it in 
this patch. And add it back whenever actually required. Let me know if you 
think otherwise.


http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager.cc@415
PS2, Line 415: ()
> nit: the parentheses are not required here
Done


http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager.cc@422
PS2, Line 422: PartitionLock* LockManager::AcquirePartitionLock(
             :     const TxnId& txn_id,
             :     TabletServerErrorPB::Code* code) {
> What happens if a non-transaction attempts to take the lock while a transac
Similar to Alexey's question, at this point I assume the caller is 
transactional op. So no lock will be returned in this case.

Sure, will do.


http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager.cc@425
PS2, Line 425: PREDICT_TRUE
> Do we expect this to be used by non-transactional operations as well?  If s
Yeah, good point. At this point, I image non-transactional ops will not try to 
acquire the partition lock as it is not required. However, this may change 
depends on how the partition lock is checked in a following patch.


http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager.cc@434
PS2, Line 434:       if (PREDICT_FALSE(txn_id.value() > 
partition_lock_->txn_id().value())) {
             :         *code = tserver::TabletServerErrorPB::TXN_LOCKED_ABORT;
             :         return nullptr;
             :       }
             :       if (PREDICT_FALSE(txn_id.value() < 
partition_lock_->txn_id().value())) {
             :         *code = tserver::TabletServerErrorPB::TXN_LOCKED_RETRY;
             :         return nullptr;
             :       }
> Looking at this code makes me think that the most anticipated case if when
Right, since transaction will try to hold the partition lock when the txn 
begins. If the txn cannot acquire the lock at that moment, it will be aborted 
or retried. There will be no following write ops which are the majority times 
when a partition lock is checked.


http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager.cc@449
PS2, Line 449:   } else {
             :     return nullptr;
             :   }
> readability nit: put this condition first and remove the extra 'else' claus
Done


http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tserver/tserver.proto
File src/kudu/tserver/tserver.proto:

http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tserver/tserver.proto@116
PS2, Line 116: transaction
> nit: to be clear, the whole transaction doesn't need to be retried, just th
Right, although my understanding is along with write op, the participant op 
(Begin TXN) can also be retried if the lock is held by other transactions at 
that moment.


http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tserver/tserver.proto@118
PS2, Line 118: TXN_LOCKED_RETRY
> nit: maybe, rename into TXN_LOCKED_RETRY_OP to signify that it's an operati
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/17097
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I158115739ce3e7cfb77bbcb854e834336c1256b1
Gerrit-Change-Number: 17097
Gerrit-PatchSet: 2
Gerrit-Owner: Hao Hao <hao....@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Hao Hao <hao....@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Wed, 24 Feb 2021 01:44:14 +0000
Gerrit-HasComments: Yes

Reply via email to