helifu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13426 )

Change subject: KUDU-2797: the master aggregates tablet metrics
......................................................................


Patch Set 5:

(16 comments)

http://gerrit.cloudera.org:8080/#/c/13426/5/src/kudu/master/catalog_manager.h
File src/kudu/master/catalog_manager.h:

http://gerrit.cloudera.org:8080/#/c/13426/5/src/kudu/master/catalog_manager.h@189
PS5, Line 189:   ReportedTabletMetricsPB peer_metrics_return();
> +1
Done


http://gerrit.cloudera.org:8080/#/c/13426/5/src/kudu/master/catalog_manager.h@189
PS5, Line 189:   ReportedTabletMetricsPB peer_metrics_return();
> There may be multiple replica metrics associated with a tablet. Which gets
Done


http://gerrit.cloudera.org:8080/#/c/13426/5/src/kudu/master/catalog_manager.h@187
PS5, Line 187:   // Upsert or delete the metrics.
             :   void peer_metrics_upsert(const std::string& peer, const 
ReportedTabletMetricsPB& metrics);
             :   ReportedTabletMetricsPB peer_metrics_return();
             :   void peer_metrics_delete(const std::string& peer);
> Looking at the other functions of the CatalogManager, perhaps these names w
Done


http://gerrit.cloudera.org:8080/#/c/13426/5/src/kudu/master/catalog_manager.cc
File src/kudu/master/catalog_manager.cc:

http://gerrit.cloudera.org:8080/#/c/13426/5/src/kudu/master/catalog_manager.cc@5243
PS5, Line 5243: peer_metrics_upsert
> In newer code, we've generally used "replica" instead of "peer" when referr
Done


http://gerrit.cloudera.org:8080/#/c/13426/5/src/kudu/master/master.proto
File src/kudu/master/master.proto:

http://gerrit.cloudera.org:8080/#/c/13426/5/src/kudu/master/master.proto@233
PS5, Line 233: MetricsPB
> I don't love the idea that this is called a "metric", since I think this co
Done


http://gerrit.cloudera.org:8080/#/c/13426/5/src/kudu/master/master.proto@233
PS5, Line 233: MetricsPB
> +1, statistics seems more descriptive of what we're building in here
Done


http://gerrit.cloudera.org:8080/#/c/13426/5/src/kudu/tserver/heartbeater.h
File src/kudu/tserver/heartbeater.h:

http://gerrit.cloudera.org:8080/#/c/13426/5/src/kudu/tserver/heartbeater.h@77
PS5, Line 77:   void SetUpdateTimeForTests(const MonoTime& time);
> nit: If possible, make the ForTests() stuff private and add the test as a f
Done


http://gerrit.cloudera.org:8080/#/c/13426/5/src/kudu/tserver/heartbeater.h@76
PS5, Line 76:   const MonoTime GetUpdateTime();
            :   void SetUpdateTimeForTests(const MonoTime& time);
            :   void UpdateTabletMetrics();
            :   Status GetTabletMetricsPB(const std::string& tablet_id, 
master::ReportedTabletMetricsPB* pb);
> Could you add comments describing these and how a user of the Heartbeater c
Done


http://gerrit.cloudera.org:8080/#/c/13426/5/src/kudu/tserver/heartbeater.cc
File src/kudu/tserver/heartbeater.cc:

http://gerrit.cloudera.org:8080/#/c/13426/5/src/kudu/tserver/heartbeater.cc@318
PS5, Line 318: // 1.
> nit: Could you remove these numbers? This function is short enough that it'
Done


http://gerrit.cloudera.org:8080/#/c/13426/5/src/kudu/tserver/heartbeater.cc@319
PS5, Line 319: lock_.try_lock()
> Please ignore my question about the threads, I left that as a note to mysel
Done


http://gerrit.cloudera.org:8080/#/c/13426/5/src/kudu/tserver/heartbeater.cc@319
PS5, Line 319: lock_.try_lock()
> nit: this would be more resistant to bugs caused by future code changes if
Done


http://gerrit.cloudera.org:8080/#/c/13426/5/src/kudu/tserver/heartbeater.cc@344
PS5, Line 344: for (auto it = tablet_metrics.begin(); it != 
tablet_metrics.end(); ++it) {
> Should we remove metrics that don't exist in tablet_metrics (i.e. tablets t
Done


http://gerrit.cloudera.org:8080/#/c/13426/5/src/kudu/tserver/heartbeater.cc@343
PS5, Line 343:   vector<string> dirty_tablets;
             :   for (auto it = tablet_metrics.begin(); it != 
tablet_metrics.end(); ++it) {
             :     auto iter = tablet_metrics_.find(it->first);
             :     if (iter != tablet_metrics_.end() &&
             :         iter->second->disk_size() == it->second->disk_size() &&
             :         iter->second->row_count() == it->second->row_count()) {
             :       continue;
             :     }
             :
             :     tablet_metrics_[it->first] = std::move(it->second);
             :     dirty_tablets.emplace_back(it->first);
             :   }
> It's hard to follow this because of how similar 'iter' and 'it' are. Maybe
Done


http://gerrit.cloudera.org:8080/#/c/13426/5/src/kudu/tserver/heartbeater.cc@357
PS5, Line 357:   update_time_ = MonoTime::Now() +
             :       
MonoDelta::FromMilliseconds(FLAGS_update_tablet_metrics_interval_ms);
> nit: I think it'd be more intuitive to keep around a 'time_of_last_update_'
Done


http://gerrit.cloudera.org:8080/#/c/13426/5/src/kudu/tserver/heartbeater.cc@642
PS5, Line 642:     // If that time expires, let's update the tablet metrics.
             :     if (MonoTime::Now() > parent_->GetUpdateTime()) {
             :       parent_->UpdateTabletMetrics();
             :     }
> How about encapsulating this check into the function itself, e.g. UpdateTab
Done


http://gerrit.cloudera.org:8080/#/c/13426/5/src/kudu/tserver/ts_tablet_manager-test.cc
File src/kudu/tserver/ts_tablet_manager-test.cc:

http://gerrit.cloudera.org:8080/#/c/13426/5/src/kudu/tserver/ts_tablet_manager-test.cc@338
PS5, Line 338: 1.
> nit: when formatting numbering, please add a space after the dot, like:
Done



--
To view, visit http://gerrit.cloudera.org:8080/13426
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I74406ab7cca7c22fda455c328b8ee9989a6b2d99
Gerrit-Change-Number: 13426
Gerrit-PatchSet: 5
Gerrit-Owner: helifu <hzhel...@corp.netease.com>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mpe...@apache.org>
Gerrit-Reviewer: helifu <hzhel...@corp.netease.com>
Gerrit-Comment-Date: Tue, 11 Jun 2019 13:39:30 +0000
Gerrit-HasComments: Yes

Reply via email to