Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17159 )

Change subject: KUDU-2612: acquire and release partition lock
......................................................................


Patch Set 8:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/integration-tests/txn_participant-itest.cc
File src/kudu/integration-tests/txn_participant-itest.cc:

http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/integration-tests/txn_participant-itest.cc@813
PS3, Line 813:     auto resp_error = StatusFromPB(resp.error().status());
> This is marked 'Done' but I don't see that a scenario with calling Particip
Done as TxnParticipantITest.TestTxnSystemClientRepeatCalls


http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/integration-tests/txn_participant-itest.cc@930
PS3, Line 930: // TODO(awong): update this when impleme
> nit: this is marked as 'Done', and I saw TODO is added in other places.  Is
Done


http://gerrit.cloudera.org:8080/#/c/17159/5/src/kudu/integration-tests/txn_participant-itest.cc
File src/kudu/integration-tests/txn_participant-itest.cc:

http://gerrit.cloudera.org:8080/#/c/17159/5/src/kudu/integration-tests/txn_participant-itest.cc@282
PS5, Line 282:         ASSERT_EQ(txns, 
replicas[i]->tablet()->txn_participant()->GetTxnsForTests());
             :       });
             :     }
> nit: is this change essential?  If not, maybe it's better to return back to
Ah I added this because the list of transactions can be quite long, and a 
difference in count is a bit easier to decipher and think about than comparing 
the full lists that get printed out. I can revert it -- it was more useful for 
debugging and can be added back if we ever need to debug this further.


http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/tablet/ops/participant_op.h
File src/kudu/tablet/ops/participant_op.h:

http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/tablet/ops/participant_op.h@75
PS3, Line 75: ors the given 'op_id
> Can this method be invoked as a part of larger operation which might have a
I'm going to punt on this. I think there'd be a decent amount of plumbing to 
get the client's deadline, and it'd also be for leaders only. I added a TODO 
instead.


http://gerrit.cloudera.org:8080/#/c/17159/5/src/kudu/tablet/ops/participant_op.cc
File src/kudu/tablet/ops/participant_op.cc:

http://gerrit.cloudera.org:8080/#/c/17159/5/src/kudu/tablet/ops/participant_op.cc@212
PS5, Line 212: "START. Timestamp: $0", 
clock::HybridClock::GetPhysicalValueMicros(state_->times
> How to release the partition lock if the following code kicks in when the d
If replication fails, the ScopedPartitionLock is dropped, and if that was the 
last reference to the partition lock, the lock is released.


http://gerrit.cloudera.org:8080/#/c/17159/5/src/kudu/tablet/tablet.h
File src/kudu/tablet/tablet.h:

http://gerrit.cloudera.org:8080/#/c/17159/5/src/kudu/tablet/tablet.h@172
PS5, Line 172:   // Acquire a shared lock on the given transaction, to ensure 
the
             :   // transaction's state doesn't change while the given write is 
in flight.
             :   Status AcquireTxnLock(int64_t txn_id, WriteOpState* op_state);
> BTW, why do we need this method when we already have a similar one for Writ
IIRC from my discussion with Hao, doing this in the BEGIN_TXN lets us catch 
races earlier. OTOH, thinking about it more, we ought to instead have the goal 
of taking the lock for as little time as possible, so I've gone and updated 
this so we only take the lock for writes.


http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/tablet/tablet.h
File src/kudu/tablet/tablet.h:

http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/tablet/tablet.h@165
PS3, Line 165: If 'must_acquire' is true,
             :   // then wait until the lock is acquired
> Write operations are executed in the context of an RPC which have a timeout
The op will proceed past the timeout. It might be avoidable, but I'm not sure 
it's a dealbreaker. At least, outside of the dispatcher, we don't have checks 
along the write path for timeouts to exit early, and at least in the context of 
the OpDrivers, I don't think it's worth the fuss for a single op, given we have 
similarly timed acquires on the write path already.


http://gerrit.cloudera.org:8080/#/c/17159/5/src/kudu/tablet/txn_participant.cc
File src/kudu/tablet/txn_participant.cc:

http://gerrit.cloudera.org:8080/#/c/17159/5/src/kudu/tablet/txn_participant.cc@41
PS5, Line 41: namespace kudu {
> warning: using decl 'Substitute' is unused [misc-unused-using-decls]
Done



--
To view, visit http://gerrit.cloudera.org:8080/17159
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If26733cae16810f3b3afd1fd05dcb984e6366939
Gerrit-Change-Number: 17159
Gerrit-PatchSet: 8
Gerrit-Owner: Hao Hao <hao....@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Hao Hao <hao....@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Wed, 07 Apr 2021 07:48:47 +0000
Gerrit-HasComments: Yes

Reply via email to