Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16527 )

Change subject: KUDU-2612: initial implementation of TxnManager
......................................................................


Patch Set 5:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/16527/3/src/kudu/master/master.cc
File src/kudu/master/master.cc:

http://gerrit.cloudera.org:8080/#/c/16527/3/src/kudu/master/master.cc@97
PS3, Line 97: true
> Yea, in general I think it's a good practice to have feature flags but keep
OK, done -- this flag is set to 'false' by default.


http://gerrit.cloudera.org:8080/#/c/16527/3/src/kudu/master/master.cc@318
PS3, Line 318: retry_interval
> Seems PS5 missed this
Whoops, indeed.  But I remember I changed this.

Anyways, now this should be fixed.


http://gerrit.cloudera.org:8080/#/c/16527/5/src/kudu/master/master.cc
File src/kudu/master/master.cc:

http://gerrit.cloudera.org:8080/#/c/16527/5/src/kudu/master/master.cc@103
PS5, Line 103: Whether to initialize TxnManager upon arrival of first request.
> Wondering how to decide if TxnManager should be initialized upon arrival of
I guess for real world scenarios in clusters which use transactions we should 
prefer non-lazy initialization.  The lazy initialization mode helps to get away 
with too many (60+) broken tests otherwise.  At some point we should introduce 
proper tablet filtering or explicitly set --txn_manager_lazily_initialized=true 
in those broken tests, and change this flag to be 'false' by default.

I added corresponding comment.


http://gerrit.cloudera.org:8080/#/c/16527/5/src/kudu/master/master.cc@319
PS5, Line 319: KLOG_EVERY_N_SECS(WARNING, 60) << Substitute(
             :         "$0: unable to init TxnManager, will retry in $1",
             :         s.ToString(), retry_interval.ToString());
             :     SleepFor(retry_interval);
> Why do we want to retry the initialization, but not as what catalog manager
TxnManager cannot complete initialization without tablet servers which host the 
txn status table's tablets.  However, master can be started when no tablet 
servers are running.

If the only mode that TxnManager would be running with were lazy 
initialization, not retrying this task might be an option, but I guess in real 
world scenarios we should prefer non-lazy initialization, hence the retry logic.

Also, current behavior makes it much easier to handle waiting on the 
txn_manager_init_status_ promise.

I added a comment on the rationale behind this task's behavior.


http://gerrit.cloudera.org:8080/#/c/16527/5/src/kudu/master/txn_manager.h
File src/kudu/master/txn_manager.h:

http://gerrit.cloudera.org:8080/#/c/16527/5/src/kudu/master/txn_manager.h@102
PS5, Line 102: static constexpr int64_t kTxnTableRangeStep = 1024;
> Can you add a comment to explain why 1024 is chosen?
There is nothing specific about this, it's just a random number so far.  I 
moved this to be a flag and added description for the flag.


http://gerrit.cloudera.org:8080/#/c/16527/5/src/kudu/master/txn_manager.cc
File src/kudu/master/txn_manager.cc:

http://gerrit.cloudera.org:8080/#/c/16527/5/src/kudu/master/txn_manager.cc@49
PS5, Line 49: If 's' is not OK and 'resp' has no application specific error set
> Do you know in which cases this will happen?
It happens in case of RPC-level errors.


http://gerrit.cloudera.org:8080/#/c/16527/5/src/kudu/master/txn_manager.cc@127
PS5, Line 127: // This method isn't supposed to be called concurrently, so 
there isn't any
             : // protection against concurrent calls.
> Should this be moved to the method declaration of the header.
The header has similar comment as well.  I put this comment here as well for 
better readability.


http://gerrit.cloudera.org:8080/#/c/16527/5/src/kudu/master/txn_manager.proto
File src/kudu/master/txn_manager.proto:

http://gerrit.cloudera.org:8080/#/c/16527/5/src/kudu/master/txn_manager.proto@94
PS5, Line 94: Whether
> nit: "What the transaction state was.."
Replaced with: The transaction state at the time of processing the request


http://gerrit.cloudera.org:8080/#/c/16527/5/src/kudu/master/txn_manager.proto@94
PS5, Line 94: was
> nit: was OK?
I replaced the whole sentence with 'The transaction state at the time of 
processing the request'



--
To view, visit http://gerrit.cloudera.org:8080/16527
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie952977a3ae5f625d1283389f0be8afb79df7d8c
Gerrit-Change-Number: 16527
Gerrit-PatchSet: 5
Gerrit-Owner: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Hao Hao <hao....@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Wed, 14 Oct 2020 04:11:50 +0000
Gerrit-HasComments: Yes

Reply via email to