Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17037 )

Change subject: KUDU-2612 tablet servers automatically register txn participants
......................................................................


Patch Set 6:

(15 comments)

http://gerrit.cloudera.org:8080/#/c/17037/6/src/kudu/client/client-test.cc
File src/kudu/client/client-test.cc:

http://gerrit.cloudera.org:8080/#/c/17037/6/src/kudu/client/client-test.cc@7694
PS6, Line 7694: }
> Maybe count the rows to verify the state?
Done


http://gerrit.cloudera.org:8080/#/c/17037/6/src/kudu/tablet/ops/participant_op.cc
File src/kudu/tablet/ops/participant_op.cc:

http://gerrit.cloudera.org:8080/#/c/17037/6/src/kudu/tablet/ops/participant_op.cc@197
PS6, Line 197: 
RETURN_NOT_OK(tablet_replica_->UnregisterTxnOpDispatcher(txn_id()));
> So begin commit can time out if the there are still writes pending when the
Right: if there are pending operations in the context of a particular 
transaction, UnregisterTxnOpDispatcher() returns ServiceUnavailable().  It 
seems reasonable to me: how it's possible to start committing a transaction 
when there are still non-applied write operations?


http://gerrit.cloudera.org:8080/#/c/17037/6/src/kudu/tablet/tablet_replica.h
File src/kudu/tablet/tablet_replica.h:

http://gerrit.cloudera.org:8080/#/c/17037/6/src/kudu/tablet/tablet_replica.h@155
PS6, Line 155: txn_participant_registrant
> nit: doc this?
Done


http://gerrit.cloudera.org:8080/#/c/17037/6/src/kudu/tablet/tablet_replica.h@176
PS6, Line 176: void SubmitPendingTxnOps(int64_t txn_id);
             :   void CancelPendingTxnOps(int64_t txn_id, const Status& s);
> nit: would you mind doc these two? Thanks!
Done


http://gerrit.cloudera.org:8080/#/c/17037/6/src/kudu/tablet/tablet_replica.h@405
PS6, Line 405:     //
> nit: spurious line
Done


http://gerrit.cloudera.org:8080/#/c/17037/6/src/kudu/tablet/tablet_replica.h@456
PS6, Line 456: ready_, ops_queue_.
> nit: if so, would be nice to at least mention how users should reason about
Done


http://gerrit.cloudera.org:8080/#/c/17037/6/src/kudu/tablet/tablet_replica.h@458
PS6, Line 458: bool ready_;
             :     bool unregistered_;
             :     Status inflight_status_;
             :     std::deque<std::unique_ptr<WriteOpState>> ops_queue_;
> nit: would you mind also doc these?
Done


http://gerrit.cloudera.org:8080/#/c/17037/6/src/kudu/tablet/tablet_replica.cc
File src/kudu/tablet/tablet_replica.cc:

http://gerrit.cloudera.org:8080/#/c/17037/6/src/kudu/tablet/tablet_replica.cc@79
PS6, Line 79: 2
> how this is evaluated. If default is 2, it doesn't seem to help a lot for b
It should help: keep in mind that clients usually buffer inserts/updates to 
many rows in one write request.  Maybe, the name for the flag isn't quite good?

I can share more data once I have performance tests to cover this.


http://gerrit.cloudera.org:8080/#/c/17037/6/src/kudu/tablet/tablet_replica.cc@552
PS6, Line 552: dispatcher = FindOrNull(txn_op_dispatchers_, txn_id);
             :     if (!dispatcher) {
             :       dispatcher = &EmplaceOrDie(
             :           &txn_op_dispatchers_,
             :           std::piecewise_construct,
             :           std::forward_as_tuple(txn_id),
             :           std::forward_as_tuple(this, 
FLAGS_tablet_max_pending_txn_write_ops));
             :     }
> Does LookupOrEmplace work here?
Done


http://gerrit.cloudera.org:8080/#/c/17037/6/src/kudu/tablet/tablet_replica.cc@567
PS6, Line 567: SubmitTxnWriteCb
> nit: seems we typically name callbacks based on what we're calling back fro
Sure, let's keep it consistent.


http://gerrit.cloudera.org:8080/#/c/17037/6/src/kudu/tablet/tablet_replica.cc@590
PS6, Line 590:     txn_op_dispatchers_.erase(it);
> Do we have any guarantees that we've submitted all the pending writes? Or a
Yes, we do: MarkUnregistered() would return non-OK status otherwise one line 
above.  In addition, TxnOpDispatcher doesn't accept any new operations once 
it's marked as unregistered (see TxnOpDispatcher::Dispatch()).


http://gerrit.cloudera.org:8080/#/c/17037/6/src/kudu/tablet/tablet_replica.cc@642
PS6, Line 642: txn_registrator
> nit: dispatcher? Same elsewhere
Done


http://gerrit.cloudera.org:8080/#/c/17037/6/src/kudu/tablet/tablet_replica.cc@1227
PS6, Line 1227: transaction is not open"
> Don't we also unregister if we cancel due to not being the leader? Eg in Su
Right -- TxnOpDispatcher::Unregister() is called upon various errors, and not 
being a leader (along with not being to register a participant) are the reasons.

I replaced it with "tablet replica could not accept txn write operation".  
That's still vague, but I don't think we want to expose TxnOpDispatcher state 
in the error messages.


http://gerrit.cloudera.org:8080/#/c/17037/6/src/kudu/tablet/tablet_replica.cc@1258
PS6, Line 1258: // Store the information necessary to recreate WriteOp 
instance: this is
              :       // useful if TabletReplica::SubmitWrite() returns non-OK 
status.
> nit: it'd be nice if you could mention the lifecycle of the request and res
These pointers are stored at this level (i.e. scope of the while() cycle) and 
cannot become bogus after calling std::move(op).  I'm not sure what to comment 
about here.


http://gerrit.cloudera.org:8080/#/c/17037/6/src/kudu/tablet/tablet_replica.cc@1304
PS6, Line 1304: Cleanup
> What's the point of returning a status here? Doesn't this always return 'st
Yep: I guess there isn't any point returning anything from Cancel() because the 
return code isn't used.  However, Cleanup() is used in Submit(), and it returns 
the status as needed.

Done.



--
To view, visit http://gerrit.cloudera.org:8080/17037
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia383f7afd208c44695c57aab82e3818fa1712ce6
Gerrit-Change-Number: 17037
Gerrit-PatchSet: 6
Gerrit-Owner: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Hao Hao <hao....@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Sat, 20 Feb 2021 02:21:56 +0000
Gerrit-HasComments: Yes

Reply via email to