Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/17159 )
Change subject: KUDU-2612: acquire and release partition lock ...................................................................... Patch Set 5: (5 comments) http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/integration-tests/txn_participant-itest.cc File src/kudu/integration-tests/txn_participant-itest.cc: http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/integration-tests/txn_participant-itest.cc@813 PS3, Line 813: } > Done This is marked 'Done' but I don't see that a scenario with calling ParticipateInTransaction(..., ParticipantOpPB::BEGIN_TXN) for the same txn_id multiple times is added. Did I miss anything? http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/integration-tests/txn_participant-itest.cc@930 PS3, Line 930: > Done nit: this is marked as 'Done', and I saw TODO is added in other places. Is it worth adding that comment here as well? http://gerrit.cloudera.org:8080/#/c/17159/5/src/kudu/integration-tests/txn_participant-itest.cc File src/kudu/integration-tests/txn_participant-itest.cc: http://gerrit.cloudera.org:8080/#/c/17159/5/src/kudu/integration-tests/txn_participant-itest.cc@282 PS5, Line 282: const auto& actual_txns = replicas[i]->tablet()->txn_participant()->GetTxnsForTests(); : ASSERT_EQ(txns.size(), actual_txns.size()); : ASSERT_EQ(txns, actual_txns); nit: is this change essential? If not, maybe it's better to return back to the version how it was in PS3? I guess with this version as in PS5, if the assertion ever triggers (even with ASSERT_EVENTUALLY), there isn't as much useful information anymore because the assertion on the size strikes first. http://gerrit.cloudera.org:8080/#/c/17159/5/src/kudu/tablet/ops/participant_op.cc File src/kudu/tablet/ops/participant_op.cc: http://gerrit.cloudera.org:8080/#/c/17159/5/src/kudu/tablet/ops/participant_op.cc@212 PS5, Line 212: RETURN_NOT_OK(replica->tablet()->AcquirePartitionLock(state_.get(), wait_mode)); How to release the partition lock if the following code kicks in when the driver run Status ParticipantOp::Prepare(): https://github.com/apache/kudu/blob/1eb2a2fbca329721ec7152714b7c95374754044b/src/kudu/tablet/ops/op_driver.cc#L298-L305 Basically, how to guarantee the lock is to be released in all possible outcomes in the OpDriver's code once the lock is taken? http://gerrit.cloudera.org:8080/#/c/17159/5/src/kudu/tablet/tablet.h File src/kudu/tablet/tablet.h: http://gerrit.cloudera.org:8080/#/c/17159/5/src/kudu/tablet/tablet.h@172 PS5, Line 172: // Similar to the above but for participant op. : Status AcquirePartitionLock(ParticipantOpState* op_state, : LockManager::LockWaitMode wait_mode); BTW, why do we need this method when we already have a similar one for WriteOpState? Would it be possible to acquire partition lock only when performing a write operation? If not, could you please add a blurb explaining why it's also necessary to acquire a lock while working with ParticipantOpState as well? -- To view, visit http://gerrit.cloudera.org:8080/17159 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If26733cae16810f3b3afd1fd05dcb984e6366939 Gerrit-Change-Number: 17159 Gerrit-PatchSet: 5 Gerrit-Owner: Hao Hao <hao....@cloudera.com> Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com> Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com> Gerrit-Reviewer: Hao Hao <hao....@cloudera.com> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Fri, 26 Mar 2021 03:34:08 +0000 Gerrit-HasComments: Yes