Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8090 )

Change subject: KUDU-1125: issue one catalog write per tablet report
......................................................................


Patch Set 6:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/8090/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8090/6//COMMIT_MSG@10
PS6, Line 10: When the
            : master is configured to fsync WAL writes, this can add a lot of 
load during
            : election storms and when the master is restarted.
> any chance you did a before/after test here? eg start a small cluster with
Haven't done any perf testing yet. I was planning to do it post-merge, but I'll 
tackle it now instead.

I'll also try to experiment to see whether your estimate for max tablets is 
correct. It's tough because although it's pretty easy to trigger a full report 
(restart the master or a tserver), the underlying write performed by that 
report is likely to be sparse because 3b6ab64 will filter out all tablets that 
haven't actually changed. Meanwhile, events that do yield real changes (such as 
an election storm or the creation of a table) tend to be staggered over time. 
In short, it's hard to create a full report with nothing but real changes.


http://gerrit.cloudera.org:8080/#/c/8090/6/src/kudu/master/catalog_manager.cc
File src/kudu/master/catalog_manager.cc:

http://gerrit.cloudera.org:8080/#/c/8090/6/src/kudu/master/catalog_manager.cc@3106
PS6, Line 3106:                "num_tablets", 
full_report.updated_tablets_size());
> would be nice to also use TRACE_COUNTER_INCREMENT("reported_tablets", updat
Done


http://gerrit.cloudera.org:8080/#/c/8090/6/src/kudu/master/catalog_manager.cc@3122
PS6, Line 3122:   unordered_map<string, ReportedTabletUpdatesPB*> updates;
> worth adding to the comment explaining ownership of these raw pointers
Done


http://gerrit.cloudera.org:8080/#/c/8090/6/src/kudu/master/catalog_manager.cc@3151
PS6, Line 3151:     ReportedTabletUpdatesPB* update = 
full_report_update->add_tablets();
> to cut down on small allocations here, consider doing full_report_update->m
Done


http://gerrit.cloudera.org:8080/#/c/8090/6/src/kudu/master/catalog_manager.cc@3157
PS6, Line 3157:       shared_lock<LockType> l(lock_);
> I think this would likely perform better if you can acquire this once for t
I think I can do that without much refactoring, just need a new scope here.


http://gerrit.cloudera.org:8080/#/c/8090/6/src/kudu/master/catalog_manager.cc@3227
PS6, Line 3227: bootstrapping
> should this now say "copying"?
Yeah. The original wording was "This prevents us from spuriously deleting 
replicas that have just been added to a pending config and are in the process 
of catching up to the log entry where they were added to the config."

Dan asked me whether it should say "just been added to the committed config and 
are in the process of bootstrapping" and I misinterpreted his suggestion as 
"change the wording to this", though that wasn't his intent.


http://gerrit.cloudera.org:8080/#/c/8090/6/src/kudu/master/catalog_manager.cc@3249
PS6, Line 3249: with an error
> nit: rephrase as "non-deleted tablet which is reporting an error" or somesu
Done


http://gerrit.cloudera.org:8080/#/c/8090/6/src/kudu/master/catalog_manager.cc@3268
PS6, Line 3268:       if (!cstate.committed_config().has_opid_index()) {
> nit: would it be possible to move this if statement up a few lines before c
Yeah, I think that'd be safe.


http://gerrit.cloudera.org:8080/#/c/8090/6/src/kudu/master/catalog_manager.cc@3285
PS6, Line 3285:       // TODO(adar): ShouldTransitionTabletToRunning doesn't 
take step 7b into
              :       // account. Should it?
> not sure I understand this question -- isn't the condition from ShouldTrans
This TODO is trying to draw attention to the fact that 
ShouldTransitionTabletToRunning when examining the replica's ConsensusStatePB, 
goes straight to report rather than looking at the local 'cstate' (which may 
have had its leader_uuid() cleared).

I am comfortable changing it to use 'cstate', but I wasn't sure whether that'd 
be correct.


http://gerrit.cloudera.org:8080/#/c/8090/6/src/kudu/master/catalog_manager.cc@3344
PS6, Line 3344:         ConsensusStatePB prev_cstate = 
tablet->metadata().state().pb.consensus_state();
> this is a little confusing - this is shadowing a variable with the same nam
Ugh, yes. Given the COW nature of the persistent data, it should be safe to 
reuse the existing prev_cstate even though we're mutating the cstate.



--
To view, visit http://gerrit.cloudera.org:8080/8090
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie6f5cf0e4b1cd1160b3b310d89c6dbf3dd62e43b
Gerrit-Change-Number: 8090
Gerrit-PatchSet: 6
Gerrit-Owner: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <danburk...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mpe...@apache.org>
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Comment-Date: Tue, 03 Oct 2017 01:08:42 +0000
Gerrit-HasComments: Yes

Reply via email to