Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/16527 )
Change subject: KUDU-2612: initial implementation of TxnManager ...................................................................... Patch Set 5: (9 comments) http://gerrit.cloudera.org:8080/#/c/16527/3/src/kudu/master/master.cc File src/kudu/master/master.cc: http://gerrit.cloudera.org:8080/#/c/16527/3/src/kudu/master/master.cc@97 PS3, Line 97: true > Yea, in general I think it's a good practice to have feature flags but keep OK, done -- this flag is set to 'false' by default. http://gerrit.cloudera.org:8080/#/c/16527/3/src/kudu/master/master.cc@318 PS3, Line 318: retry_interval > Seems PS5 missed this Whoops, indeed. But I remember I changed this. Anyways, now this should be fixed. http://gerrit.cloudera.org:8080/#/c/16527/5/src/kudu/master/master.cc File src/kudu/master/master.cc: http://gerrit.cloudera.org:8080/#/c/16527/5/src/kudu/master/master.cc@103 PS5, Line 103: Whether to initialize TxnManager upon arrival of first request. > Wondering how to decide if TxnManager should be initialized upon arrival of I guess for real world scenarios in clusters which use transactions we should prefer non-lazy initialization. The lazy initialization mode helps to get away with too many (60+) broken tests otherwise. At some point we should introduce proper tablet filtering or explicitly set --txn_manager_lazily_initialized=true in those broken tests, and change this flag to be 'false' by default. I added corresponding comment. http://gerrit.cloudera.org:8080/#/c/16527/5/src/kudu/master/master.cc@319 PS5, Line 319: KLOG_EVERY_N_SECS(WARNING, 60) << Substitute( : "$0: unable to init TxnManager, will retry in $1", : s.ToString(), retry_interval.ToString()); : SleepFor(retry_interval); > Why do we want to retry the initialization, but not as what catalog manager TxnManager cannot complete initialization without tablet servers which host the txn status table's tablets. However, master can be started when no tablet servers are running. If the only mode that TxnManager would be running with were lazy initialization, not retrying this task might be an option, but I guess in real world scenarios we should prefer non-lazy initialization, hence the retry logic. Also, current behavior makes it much easier to handle waiting on the txn_manager_init_status_ promise. I added a comment on the rationale behind this task's behavior. http://gerrit.cloudera.org:8080/#/c/16527/5/src/kudu/master/txn_manager.h File src/kudu/master/txn_manager.h: http://gerrit.cloudera.org:8080/#/c/16527/5/src/kudu/master/txn_manager.h@102 PS5, Line 102: static constexpr int64_t kTxnTableRangeStep = 1024; > Can you add a comment to explain why 1024 is chosen? There is nothing specific about this, it's just a random number so far. I moved this to be a flag and added description for the flag. http://gerrit.cloudera.org:8080/#/c/16527/5/src/kudu/master/txn_manager.cc File src/kudu/master/txn_manager.cc: http://gerrit.cloudera.org:8080/#/c/16527/5/src/kudu/master/txn_manager.cc@49 PS5, Line 49: If 's' is not OK and 'resp' has no application specific error set > Do you know in which cases this will happen? It happens in case of RPC-level errors. http://gerrit.cloudera.org:8080/#/c/16527/5/src/kudu/master/txn_manager.cc@127 PS5, Line 127: // This method isn't supposed to be called concurrently, so there isn't any : // protection against concurrent calls. > Should this be moved to the method declaration of the header. The header has similar comment as well. I put this comment here as well for better readability. http://gerrit.cloudera.org:8080/#/c/16527/5/src/kudu/master/txn_manager.proto File src/kudu/master/txn_manager.proto: http://gerrit.cloudera.org:8080/#/c/16527/5/src/kudu/master/txn_manager.proto@94 PS5, Line 94: Whether > nit: "What the transaction state was.." Replaced with: The transaction state at the time of processing the request http://gerrit.cloudera.org:8080/#/c/16527/5/src/kudu/master/txn_manager.proto@94 PS5, Line 94: was > nit: was OK? I replaced the whole sentence with 'The transaction state at the time of processing the request' -- To view, visit http://gerrit.cloudera.org:8080/16527 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie952977a3ae5f625d1283389f0be8afb79df7d8c Gerrit-Change-Number: 16527 Gerrit-PatchSet: 5 Gerrit-Owner: Alexey Serbin <aser...@cloudera.com> Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com> Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com> Gerrit-Reviewer: Hao Hao <hao....@cloudera.com> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Wed, 14 Oct 2020 04:11:50 +0000 Gerrit-HasComments: Yes