Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16952 )

Change subject: wip KUDU-2612: background task to commit transaction
......................................................................


Patch Set 4:

(16 comments)

Overall looks pretty good after a quick initial look.

I haven't dug into the logic of the orchestration tasks -- I'm planning to do 
that over the coming weekend.

http://gerrit.cloudera.org:8080/#/c/16952/4/src/kudu/client/client-test.cc
File src/kudu/client/client-test.cc:

http://gerrit.cloudera.org:8080/#/c/16952/4/src/kudu/client/client-test.cc@7200
PS4, Line 7200: true /* wait */
nit here and below where changed from 'false' --> 'true': consider not 
specifying the 'wait' parameter, since it's 'true' by default


http://gerrit.cloudera.org:8080/#/c/16952/4/src/kudu/client/client-test.cc@7201
PS4, Line 7201:       // TODO(aserbin): when txn lifecycle is properly 
implemented, inject a
              :       //                delay into the txn finalizing code to 
make sure
              :       //                the transaction stays in the 
COMMIT_IN_PROGRESS state
              :       //                for a while
nit: I guess this should be removed once the scenario updated to -- I guess 
it's doesn't make much sense to fuss around artificial delays once the real 
transaction orchestration is ready.


http://gerrit.cloudera.org:8080/#/c/16952/4/src/kudu/client/client-test.cc@7217
PS4, Line 7217: true
nit: maybe, set it to 'false' to be complete sure that IsCommitComplete() works 
as expected


http://gerrit.cloudera.org:8080/#/c/16952/4/src/kudu/integration-tests/txn_commit-itest.cc
File src/kudu/integration-tests/txn_commit-itest.cc:

http://gerrit.cloudera.org:8080/#/c/16952/4/src/kudu/integration-tests/txn_commit-itest.cc@222
PS4, Line 222:     // TODO(awong): expose transaction ID as public API? Might 
be useful for
             :     // debugging and general observability.
             :     int64_t txn_id = cur_txn_id_++;
Actually, there is way of doing this without exposing transaction identifiers 
as part of public API -- it's a undocumented feature and it relies on 
implementation specific details, but it's fine to be used in tests.

The idea is to serialize the transaction token and then peek into the 
TxnTokenPB::txn_id, assuming the serialized token string is just a serialized 
TxnTokenPB instance (it's so at this point).  You can find an example of this 
trick at 
https://github.com/apache/kudu/blob/948f92e787136f12dbfb3e7195fef24db0be0088/src/kudu/client/client-test.cc#L441-L448


http://gerrit.cloudera.org:8080/#/c/16952/4/src/kudu/integration-tests/txn_commit-itest.cc@283
PS4, Line 283: 10
nit: consider introducing a constant and using it here and below


http://gerrit.cloudera.org:8080/#/c/16952/4/src/kudu/integration-tests/txn_commit-itest.cc@291
PS4, Line 291: /*wait*/true
nit: if it makes sense, consider omitting the 'wait' parameter since here since 
it's 'true' by default


http://gerrit.cloudera.org:8080/#/c/16952/4/src/kudu/integration-tests/txn_commit-itest.cc@316
PS4, Line 316:   Status completion_status;
             :   bool is_complete;
             :   Status s = txn->IsCommitComplete(&is_complete, 
&completion_status);
             :   ASSERT_TRUE(s.IsTimedOut()) << s.ToString();
nit: how long does it take to complete?  If more than 3 seconds, consider 
declaring this scenario as 'slow', requiring KUDU_ALLOW_SLOW_TESTS=1 to run


http://gerrit.cloudera.org:8080/#/c/16952/4/src/kudu/integration-tests/txn_commit-itest.cc@663
PS4, Line 663:   ASSERT_TRUE(completion_status.IsIncomplete());
nit: maybe, add ' << completion_status.ToString()' to diagnose a failure (if 
any ever occurs)?


http://gerrit.cloudera.org:8080/#/c/16952/4/src/kudu/integration-tests/txn_commit-itest.cc@680
PS4, Line 680:
Just in case I didn't miss it among those scenarios: it would be great to add a 
scenario to make sure the client doesn't need to send keepalive messages for a 
transaction once it initiated committing it.  I guess the scenario would be 
something like:
  * delay all those commit orchestration tasks
  * start a transaction, resulting in a 'txn' handle
  * call txn->Commit(false)
  * remove the txn out of the scope (so it starts automatic heartbeating)
  * make sure the transaction isn't aborted automatically but rather committed 
after the delayed orchestration tasks are finally run


http://gerrit.cloudera.org:8080/#/c/16952/4/src/kudu/transactions/txn_status_manager.h
File src/kudu/transactions/txn_status_manager.h:

http://gerrit.cloudera.org:8080/#/c/16952/4/src/kudu/transactions/txn_status_manager.h@80
PS4, Line 80: int i
nit: it would be nice to document the parameter and give it a bit more 
meaningful name


http://gerrit.cloudera.org:8080/#/c/16952/4/src/kudu/transactions/txn_status_manager.cc
File src/kudu/transactions/txn_status_manager.cc:

http://gerrit.cloudera.org:8080/#/c/16952/4/src/kudu/transactions/txn_status_manager.cc@137
PS4, Line 137:   // Status callback called with the result from the participant 
op. This is
nit: add a DCHECK() for the index 'i' in 'participant_ids_'?


http://gerrit.cloudera.org:8080/#/c/16952/4/src/kudu/transactions/txn_status_manager.cc@208
PS4, Line 208: finalize commit
nit: if it makes sense, unify using 'finalize commit' and 'FINALIZE_COMMIT' in 
the log messages


http://gerrit.cloudera.org:8080/#/c/16952/4/src/kudu/transactions/txn_status_manager.cc@253
PS4, Line 253: CHECK_OK
What if the commit_pool_ is being shut down?  Is it supposed to crash?


http://gerrit.cloudera.org:8080/#/c/16952/4/src/kudu/transactions/txn_system_client.cc
File src/kudu/transactions/txn_system_client.cc:

http://gerrit.cloudera.org:8080/#/c/16952/4/src/kudu/transactions/txn_system_client.cc@449
PS4, Line 449:   Status s;
             :   while (MonoTime::Now() < deadline) {
             :     s = GetClient(client);
             :     if (PREDICT_TRUE(s.ok())) {
             :       DCHECK(*client);
             :       return Status::OK();
             :     }
             :     SleepFor(MonoDelta::FromMilliseconds(100));
             :   }
nit: probably, rewriting this using

do {
  ...
} while ()

might be a slightly better approach it's expected to have the client 
initialized most of the times -- that way there wouldn't be an extra call to 
MonoTime::Now()


http://gerrit.cloudera.org:8080/#/c/16952/4/src/kudu/tserver/tablet_server.cc
File src/kudu/tserver/tablet_server.cc:

http://gerrit.cloudera.org:8080/#/c/16952/4/src/kudu/tserver/tablet_server.cc@48
PS4, Line 48: using kudu::transactions::TxnSystemClient;
nit: is this really needed here?


http://gerrit.cloudera.org:8080/#/c/16952/4/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

http://gerrit.cloudera.org:8080/#/c/16952/4/src/kudu/tserver/ts_tablet_manager.cc@1275
PS4, Line 1275:   // Now that our TxnStatusManagers have shut down, clean up 
out threadpool.
              :   txn_commit_pool_->Shutdown();
Is this the second time calling shutdown on the same pool?  I'd expect that 
calling this at line 1258 should be enough already.



--
To view, visit http://gerrit.cloudera.org:8080/16952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie2258dded3ab3d527cb5d0abdc7d5e7deb4da15e
Gerrit-Change-Number: 16952
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Hao Hao <hao....@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Sat, 30 Jan 2021 04:49:00 +0000
Gerrit-HasComments: Yes

Reply via email to