Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16952 )
Change subject: KUDU-2612: background task to commit transaction ...................................................................... Patch Set 5: (20 comments) http://gerrit.cloudera.org:8080/#/c/16952/4/src/kudu/client/client-test.cc File src/kudu/client/client-test.cc: http://gerrit.cloudera.org:8080/#/c/16952/4/src/kudu/client/client-test.cc@7200 PS4, Line 7200: > nit here and below where changed from 'false' --> 'true': consider not spec Done http://gerrit.cloudera.org:8080/#/c/16952/4/src/kudu/client/client-test.cc@7201 PS4, Line 7201: shared_ptr<KuduTransaction> txn; : ASSERT_OK(client_->NewTransaction(&txn)); : ASSERT_OK(txn->Commit()); : bool is_complete = false; > nit: I guess this should be removed once the scenario updated to -- I guess Done http://gerrit.cloudera.org:8080/#/c/16952/4/src/kudu/client/client-test.cc@7217 PS4, Line 7217: > nit: maybe, set it to 'false' to be complete sure that IsCommitComplete() w Done http://gerrit.cloudera.org:8080/#/c/16952/4/src/kudu/integration-tests/txn_commit-itest.cc File src/kudu/integration-tests/txn_commit-itest.cc: http://gerrit.cloudera.org:8080/#/c/16952/4/src/kudu/integration-tests/txn_commit-itest.cc@222 PS4, Line 222: KuduScanner scanner(table.get()); : RETURN_NOT_OK(scanner.Open()); : int rows = 0; > Actually, there is way of doing this without exposing transaction identifie Ah, thanks for the pointer! Done http://gerrit.cloudera.org:8080/#/c/16952/4/src/kudu/integration-tests/txn_commit-itest.cc@283 PS4, Line 283: kN > nit: consider introducing a constant and using it here and below Done http://gerrit.cloudera.org:8080/#/c/16952/4/src/kudu/integration-tests/txn_commit-itest.cc@291 PS4, Line 291: itComplete(& > nit: if it makes sense, consider omitting the 'wait' parameter since here s Done http://gerrit.cloudera.org:8080/#/c/16952/4/src/kudu/integration-tests/txn_commit-itest.cc@316 PS4, Line 316: shared_ptr<KuduTransaction> txn; : shared_ptr<KuduSession> txn_session; : ASSERT_OK(BeginTransaction(participant_ids_, &txn, &txn_session)); : FLAGS_txn_status_manager_inject_latency_fina > nit: how long does it take to complete? If more than 3 seconds, consider d Done http://gerrit.cloudera.org:8080/#/c/16952/4/src/kudu/integration-tests/txn_commit-itest.cc@663 PS4, Line 663: ASSERT_TRUE(completion_status.IsIncomplete()) << completion_status.ToString(); > nit: maybe, add ' << completion_status.ToString()' to diagnose a failure (i Done http://gerrit.cloudera.org:8080/#/c/16952/4/src/kudu/integration-tests/txn_commit-itest.cc@680 PS4, Line 680: > Just in case I didn't miss it among those scenarios: it would be great to a Done http://gerrit.cloudera.org:8080/#/c/16952/1/src/kudu/transactions/txn_status_entry.cc File src/kudu/transactions/txn_status_entry.cc: http://gerrit.cloudera.org:8080/#/c/16952/1/src/kudu/transactions/txn_status_entry.cc@a45 PS1, Line 45: > This condition no longer hold? No, e.g. when reloading the tablet after committing. Seems our tests didn't cover those codepaths previously. http://gerrit.cloudera.org:8080/#/c/16952/4/src/kudu/transactions/txn_status_manager.h File src/kudu/transactions/txn_status_manager.h: http://gerrit.cloudera.org:8080/#/c/16952/4/src/kudu/transactions/txn_status_manager.h@80 PS4, Line 80: > nit: it would be nice to document the parameter and give it a bit more mean Done http://gerrit.cloudera.org:8080/#/c/16952/4/src/kudu/transactions/txn_status_manager.cc File src/kudu/transactions/txn_status_manager.cc: http://gerrit.cloudera.org:8080/#/c/16952/4/src/kudu/transactions/txn_status_manager.cc@137 PS4, Line 137: using kudu::pb_util::SecureShortDebugString; > nit: add a DCHECK() for the index 'i' in 'participant_ids_'? Done http://gerrit.cloudera.org:8080/#/c/16952/4/src/kudu/transactions/txn_status_manager.cc@208 PS4, Line 208: such errors wou > nit: if it makes sense, unify using 'finalize commit' and 'FINALIZE_COMMIT' Done http://gerrit.cloudera.org:8080/#/c/16952/4/src/kudu/transactions/txn_status_manager.cc@253 PS4, Line 253: if (PR > What if the commit_pool_ is being shut down? Is it supposed to crash? Nice catch. I've updated the shutdown logic such that we always finish the commit tasks before shutting down the threadpool. http://gerrit.cloudera.org:8080/#/c/16952/4/src/kudu/transactions/txn_system_client.cc File src/kudu/transactions/txn_system_client.cc: http://gerrit.cloudera.org:8080/#/c/16952/4/src/kudu/transactions/txn_system_client.cc@449 PS4, Line 449: Status s; : do { : s = GetClient(client); : if (PREDICT_TRUE(s.ok())) { : DCHECK(*client); : return Status::OK(); : } : SleepFor(MonoDelta::FromMilliseconds(100)); : } > nit: probably, rewriting this using Done http://gerrit.cloudera.org:8080/#/c/16952/1/src/kudu/tserver/tablet_server.cc File src/kudu/tserver/tablet_server.cc: http://gerrit.cloudera.org:8080/#/c/16952/1/src/kudu/tserver/tablet_server.cc@107 PS1, Line 107: client_initializer_.reset(new TxnSystemClientInitializer); > I sounds like a good idea: it might be the case when the client cannot conn Done http://gerrit.cloudera.org:8080/#/c/16952/4/src/kudu/tserver/tablet_server.cc File src/kudu/tserver/tablet_server.cc: http://gerrit.cloudera.org:8080/#/c/16952/4/src/kudu/tserver/tablet_server.cc@48 PS4, Line 48: using kudu::transactions::TxnSystemClientI > nit: is this really needed here? Done http://gerrit.cloudera.org:8080/#/c/16952/1/src/kudu/tserver/ts_tablet_manager.h File src/kudu/tserver/ts_tablet_manager.h: http://gerrit.cloudera.org:8080/#/c/16952/1/src/kudu/tserver/ts_tablet_manager.h@415 PS1, Line 415: // Thread pool used to reload transaction status tablets asynchronously. > nit: add a doc Done http://gerrit.cloudera.org:8080/#/c/16952/1/src/kudu/tserver/ts_tablet_manager.cc File src/kudu/tserver/ts_tablet_manager.cc: http://gerrit.cloudera.org:8080/#/c/16952/1/src/kudu/tserver/ts_tablet_manager.cc@1258 PS1, Line 1258: break; > shut down the txn_commit_pool_ when tserver is shutting down? Done http://gerrit.cloudera.org:8080/#/c/16952/4/src/kudu/tserver/ts_tablet_manager.cc File src/kudu/tserver/ts_tablet_manager.cc: http://gerrit.cloudera.org:8080/#/c/16952/4/src/kudu/tserver/ts_tablet_manager.cc@1275 PS4, Line 1275: : // Signal the only task runni > Is this the second time calling shutdown on the same pool? I'd expect that Oops I want to keep this one, since it happens after our replicas have shut down. -- To view, visit http://gerrit.cloudera.org:8080/16952 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie2258dded3ab3d527cb5d0abdc7d5e7deb4da15e Gerrit-Change-Number: 16952 Gerrit-PatchSet: 5 Gerrit-Owner: Andrew Wong <aw...@cloudera.com> Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com> Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com> Gerrit-Reviewer: Hao Hao <hao....@cloudera.com> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Mon, 01 Feb 2021 08:39:31 +0000 Gerrit-HasComments: Yes