Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/17159 )
Change subject: KUDU-2612: acquire and release partition lock ...................................................................... Patch Set 3: (11 comments) http://gerrit.cloudera.org:8080/#/c/17159/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17159/3//COMMIT_MSG@7 PS3, Line 7: acquire and release partition lock Do you plan to add an end-to-end test scenario into client.cc or alike to verify how this works if looking from the client side? http://gerrit.cloudera.org:8080/#/c/17159/3//COMMIT_MSG@13 PS3, Line 13: parition partition http://gerrit.cloudera.org:8080/#/c/17159/3//COMMIT_MSG@16 PS3, Line 16: will be aborted or retried Do you mind extending this a bit to explain what actor is to abort or retry corresponding transaction/write op? http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/client/batcher.cc File src/kudu/client/batcher.cc: http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/client/batcher.cc@530 PS3, Line 530: // Status::Abort() originated from TabletServerErrorPB::TXN_LOCKED_ABORT : // response are needlessly retried. : > Isn't this handled above? Or do you mean TXN_LOCKED_RETRY? We'd probably be +1 If that's about TXN_LOCKED_RETRY, I agree that it's better to handle TXN_LOCKED_RETRY as a specific error code than generic IsIllegalState() http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/integration-tests/txn_participant-itest.cc File src/kudu/integration-tests/txn_participant-itest.cc: http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/integration-tests/txn_participant-itest.cc@813 PS3, Line 813: TEST_F(TxnParticipantITest, TestTxnSystemClientBeginTxnLockAbort) { Do you mind adding a scenario with calling ParticipateInTransaction() for the same txn_id multiple times? http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/integration-tests/txn_participant-itest.cc@833 PS3, Line 833: s.IsAborted Hmm, interesting: what actor is to abort the transaction? I'm curious about the bigger picture: I was under impression that it's the TxnManager who takes care of that, right? Could you add a comment clarifying that at this point kSecondTxn isn't aborted yet (well, it's not even started, actually, right)? http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/integration-tests/txn_participant-itest.cc@930 PS3, Line 930: FLAGS_enable_txn_partition_lock = false; It would be great if you could add a note explaining that this scenario became a fully synthetic one with the introduction of coarse-grain locking. It might be a TODO reminding to remove the override for the flag once fine-grained locking appears. http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/tablet/ops/participant_op.h File src/kudu/tablet/ops/participant_op.h: http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/tablet/ops/participant_op.h@75 PS3, Line 75: AcquirePartitionLock Can this method be invoked as a part of larger operation which might have a deadline? If so, should this method take an optional 'deadline' parameter as well? http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/tablet/ops/participant_op.cc File src/kudu/tablet/ops/participant_op.cc: http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/tablet/ops/participant_op.cc@123 PS3, Line 123: TRACE("Released partition lock"); Probably, we want to distinguish in the trace whether this is was a no-op or actual release of the lock, no? http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/tablet/tablet.h File src/kudu/tablet/tablet.h: http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/tablet/tablet.h@165 PS3, Line 165: If 'must_acquire' is true, : // then wait until the lock is acquired Write operations are executed in the context of an RPC which have a timeout. What happens if a lock is still being acquired but the operation has timed out? Could this end up in accumulating too many ops in the queue? http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/tablet/tablet.h@166 PS3, Line 166: return : // 'TXN_LOCKED_ABORT' or 'TXN_LOCKED_RETRY_OP' error How is it to return those if the return type of this method is Status? -- To view, visit http://gerrit.cloudera.org:8080/17159 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If26733cae16810f3b3afd1fd05dcb984e6366939 Gerrit-Change-Number: 17159 Gerrit-PatchSet: 3 Gerrit-Owner: Hao Hao <hao....@cloudera.com> Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com> Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com> Gerrit-Reviewer: Hao Hao <hao....@cloudera.com> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Thu, 11 Mar 2021 22:11:50 +0000 Gerrit-HasComments: Yes