Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16527 )
Change subject: WIP KUDU-2612 p??: initial implementation of TxnManager ...................................................................... Patch Set 1: (8 comments) http://gerrit.cloudera.org:8080/#/c/16527/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16527/1//COMMIT_MSG@14 PS1, Line 14: * Add many more tests, especially for BeginTransaction to make : txn_id allocation works as expected: as of now, it seems : we could get surprises until TxnStatusManager::BeginTransaction() : updated to avoid incrementing highest_txn_id_ and responding : back when the txn status tablet replica isn't a leader. nit: would it make sense to split up the txn ID assignment piece from the XST initialization logic from the less controversial boilerplate code? http://gerrit.cloudera.org:8080/#/c/16527/1//COMMIT_MSG@19 PS1, Line 19: Change the initialization of the TxnManager in master to : run on first txn-related request: that should fix the issue : of too many retries when masters are up and running, but : no tablet servers are around yet. Also, that should fix : many of currently failing tests scenarios which expect : there isn't any existing tables when they run right after : starting a mini cluster first time. We discussed this a bit on Slack earlier -- I like the idea of lazily creating and opening the transaction status table via a KuduOnceLambda or somesuch, though in that case we might need to watch out for block many threads on table initialization. Alternatively, ensure that as we are initializing the table, further calls to the table will return a retriable error. http://gerrit.cloudera.org:8080/#/c/16527/1/src/kudu/master/catalog_manager.cc File src/kudu/master/catalog_manager.cc: http://gerrit.cloudera.org:8080/#/c/16527/1/src/kudu/master/catalog_manager.cc@3235 PS1, Line 3235: for (const auto& elem : table_ids_map_) { : const auto& table_info = elem.second; : if (table_info->type() != TableTypePB::DEFAULT_TABLE) { : continue; : } : tables->emplace_back(table_info); : } Making sure I understand why this is here -- will we need this if we go down the lazy XST initialization route? Internally, the table will exist, so I only blocked access from public APIs like ListTables. Not sure it makes sense to hide the system table from internal APIs. It seems there is also a functional difference in that the XST would not show up in the web UI. http://gerrit.cloudera.org:8080/#/c/16527/1/src/kudu/master/txn_manager-test.cc File src/kudu/master/txn_manager-test.cc: http://gerrit.cloudera.org:8080/#/c/16527/1/src/kudu/master/txn_manager-test.cc@108 PS1, Line 108: ASSERT_TRUE(resp.has_txn_id()); nit: maybe check that these are all unique? http://gerrit.cloudera.org:8080/#/c/16527/1/src/kudu/master/txn_manager.cc File src/kudu/master/txn_manager.cc: http://gerrit.cloudera.org:8080/#/c/16527/1/src/kudu/master/txn_manager.cc@121 PS1, Line 121: txn_id_ nit: maybe call this next_txn_id_ so it's clear that it's, e.g. not the highest seen txn ID so far? http://gerrit.cloudera.org:8080/#/c/16527/1/src/kudu/master/txn_manager.cc@123 PS1, Line 123: static constexpr auto kMaxIterNum = : std::numeric_limits<decltype(try_txn_id)>::digits - 1; nit: It isn't clear to me what this is or why we're using it. Could you add comments about why this value? Also is this "the right" value or just a placeholder? http://gerrit.cloudera.org:8080/#/c/16527/1/src/kudu/master/txn_manager.cc@132 PS1, Line 132: if (txn_id_.compare_exchange_strong(stored_txn_id, try_txn_id + 1) || : stored_txn_id > try_txn_id) Should we do this exchange even if we didn't get a success? E.g. if one thread is continuing to retry a higher txn ID, and another thread just starts to call BeginTransaction(), we'd want that other thread to start trying with a higher txn ID, right? http://gerrit.cloudera.org:8080/#/c/16527/1/src/kudu/transactions/txn_system_client.cc File src/kudu/transactions/txn_system_client.cc: http://gerrit.cloudera.org:8080/#/c/16527/1/src/kudu/transactions/txn_system_client.cc@62 PS1, Line 62: if (master_addrs.empty()) { : return Status::InvalidArgument("empty list of master addresses"); : } Can we get the non-empty list from the Master directly, assuming this is only empty if the --master_addrs field is empty? -- To view, visit http://gerrit.cloudera.org:8080/16527 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie952977a3ae5f625d1283389f0be8afb79df7d8c Gerrit-Change-Number: 16527 Gerrit-PatchSet: 1 Gerrit-Owner: Alexey Serbin <aser...@cloudera.com> Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Thu, 01 Oct 2020 22:42:06 +0000 Gerrit-HasComments: Yes