Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13426 )

Change subject: KUDU-2797: the master aggregates tablet statistics
......................................................................


Patch Set 15:

(5 comments)

Looking good for the most part! I'm less familiar with our metrics framework; 
Mike, could you take a look if you can?

http://gerrit.cloudera.org:8080/#/c/13426/15//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13426/15//COMMIT_MSG@12
PS15, Line 12: In this patch, tablet statistics including disk size and live row
             : counts of all the replicas are aggregated on the master and only
             : the leader ones are summed up and displayed on the 'TablePage'.
             : At the same time, every replica's statistics are displayed on the
             : 'TabletPage'.
Could you also add the detail that these are also exposed as metrics on the 
masters and tablet servers.


http://gerrit.cloudera.org:8080/#/c/13426/15/src/kudu/integration-tests/ts_tablet_manager-itest.cc
File src/kudu/integration-tests/ts_tablet_manager-itest.cc:

http://gerrit.cloudera.org:8080/#/c/13426/15/src/kudu/integration-tests/ts_tablet_manager-itest.cc@632
PS15, Line 632:   while (true) {
              :     NO_FATALS(TriggerHeartbeat());
              :     ASSERT_EQ(0, 
table_info->metrics()->live_row_count->value());
              :     if (table_info->metrics()->on_disk_size->value() > 0) {
              :       break;
              :     }
              :   }
nit: if you use ASSERT_EVENTUALLY, it will do some back-off for you. E.g.

ASSERT_EVENTUALLY([&] {
  NO_FATALS(TriggerHeartbeat());
  ASSERT_EQ(0, table_info->metrics()->live_row_count->value());
  ASSERT_LT(0, table_info->metrics()->on_disk_size->value());
});

Same below.


http://gerrit.cloudera.org:8080/#/c/13426/15/src/kudu/master/master_stats.cc
File src/kudu/master/master_stats.cc:

PS15:
This has more to do with "metrics" than it does "stats". How about calling it 
TableMetrics?


http://gerrit.cloudera.org:8080/#/c/13426/15/src/kudu/tablet/tablet_replica.cc
File src/kudu/tablet/tablet_replica.cc:

http://gerrit.cloudera.org:8080/#/c/13426/15/src/kudu/tablet/tablet_replica.cc@845
PS15, Line 845: const ReportedTabletStatsPB& TabletReplica::stats_pb() const {
              :   return stats_pb_;
              : }
On second thought, I think for more obvious thread-safety, this should indeed 
return a copy instead of a cref. And also maybe move stats_lock_ from the 
TsTableManager into this class here instead, rather than taking a server-wide 
lock for each replica.


http://gerrit.cloudera.org:8080/#/c/13426/15/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

http://gerrit.cloudera.org:8080/#/c/13426/15/src/kudu/tserver/ts_tablet_manager.cc@1308
PS15, Line 1308:     lock_stats_.lock_shared();
               :     ReportedTabletStatsPB stats_pb = replica->stats_pb();
               :     lock_stats_.unlock_shared();
The fact that we're taking a lock just for this makes me think the locking 
should actually be in the TabletReplica. See my comment in tablet_replica.cc



--
To view, visit http://gerrit.cloudera.org:8080/13426
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I74406ab7cca7c22fda455c328b8ee9989a6b2d99
Gerrit-Change-Number: 13426
Gerrit-PatchSet: 15
Gerrit-Owner: helifu <hzhel...@corp.netease.com>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mpe...@apache.org>
Gerrit-Reviewer: helifu <hzhel...@corp.netease.com>
Gerrit-Comment-Date: Mon, 24 Jun 2019 22:20:43 +0000
Gerrit-HasComments: Yes

Reply via email to