Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17159 )

Change subject: KUDU-2612: acquire and release partition lock
......................................................................


Patch Set 3:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/17159/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17159/3//COMMIT_MSG@7
PS3, Line 7: acquire and release partition lock
Do you plan to add an end-to-end test scenario into client.cc or alike to 
verify how this works if looking from the client side?


http://gerrit.cloudera.org:8080/#/c/17159/3//COMMIT_MSG@13
PS3, Line 13: parition
partition


http://gerrit.cloudera.org:8080/#/c/17159/3//COMMIT_MSG@16
PS3, Line 16: will be aborted or retried
Do you mind extending this a bit to explain what actor is to abort or retry 
corresponding transaction/write op?


http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/client/batcher.cc
File src/kudu/client/batcher.cc:

http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/client/batcher.cc@530
PS3, Line 530:     //                Status::Abort() originated from 
TabletServerErrorPB::TXN_LOCKED_ABORT
             :     //                response are needlessly retried.
             :
> Isn't this handled above? Or do you mean TXN_LOCKED_RETRY? We'd probably be
+1

If that's about TXN_LOCKED_RETRY, I agree that it's better to handle 
TXN_LOCKED_RETRY as a specific error code than generic IsIllegalState()


http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/integration-tests/txn_participant-itest.cc
File src/kudu/integration-tests/txn_participant-itest.cc:

http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/integration-tests/txn_participant-itest.cc@813
PS3, Line 813: TEST_F(TxnParticipantITest, 
TestTxnSystemClientBeginTxnLockAbort) {
Do you mind adding a scenario with calling ParticipateInTransaction() for the 
same txn_id multiple times?


http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/integration-tests/txn_participant-itest.cc@833
PS3, Line 833: s.IsAborted
Hmm, interesting: what actor is to abort the transaction?  I'm curious about 
the bigger picture: I was under impression that it's the TxnManager who takes 
care of that, right?  Could you add a comment clarifying that at this point 
kSecondTxn isn't aborted yet (well, it's not even started, actually, right)?


http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/integration-tests/txn_participant-itest.cc@930
PS3, Line 930: FLAGS_enable_txn_partition_lock = false;
It would be great if you could add a note explaining that this scenario became 
a fully synthetic one with the introduction of coarse-grain locking.  It might 
be a TODO reminding to remove the override for the flag once fine-grained 
locking appears.


http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/tablet/ops/participant_op.h
File src/kudu/tablet/ops/participant_op.h:

http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/tablet/ops/participant_op.h@75
PS3, Line 75: AcquirePartitionLock
Can this method be invoked as a part of larger operation which might have a 
deadline?   If so, should this method take an optional 'deadline' parameter as 
well?


http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/tablet/ops/participant_op.cc
File src/kudu/tablet/ops/participant_op.cc:

http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/tablet/ops/participant_op.cc@123
PS3, Line 123:   TRACE("Released partition lock");
Probably, we want to distinguish in the trace whether this is was a no-op or 
actual release of the lock, no?


http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/tablet/tablet.h
File src/kudu/tablet/tablet.h:

http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/tablet/tablet.h@165
PS3, Line 165: If 'must_acquire' is true,
             :   // then wait until the lock is acquired
Write operations are executed in the context of an RPC which have a timeout.  
What happens if a lock is still being acquired but the operation has timed out? 
 Could this end up in accumulating too many ops in the queue?


http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/tablet/tablet.h@166
PS3, Line 166: return
             :   // 'TXN_LOCKED_ABORT' or 'TXN_LOCKED_RETRY_OP' error
How is it to return those if the return type of this method is Status?



--
To view, visit http://gerrit.cloudera.org:8080/17159
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If26733cae16810f3b3afd1fd05dcb984e6366939
Gerrit-Change-Number: 17159
Gerrit-PatchSet: 3
Gerrit-Owner: Hao Hao <hao....@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Hao Hao <hao....@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Thu, 11 Mar 2021 22:11:50 +0000
Gerrit-HasComments: Yes

Reply via email to