Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/13426 )
Change subject: KUDU-2797: the master aggregates tablet statistics ...................................................................... Patch Set 11: (1 comment) Haven't looked in depth, but I left feedback on the new approach. http://gerrit.cloudera.org:8080/#/c/13426/11/src/kudu/tablet/tablet_replica.cc File src/kudu/tablet/tablet_replica.cc: http://gerrit.cloudera.org:8080/#/c/13426/11/src/kudu/tablet/tablet_replica.cc@835 PS11, Line 835: pb.set_rows_inserted(tablet->metrics()->rows_inserted->value()); : pb.set_rows_upserted(tablet->metrics()->rows_upserted->value()); : pb.set_rows_updated(tablet->metrics()->rows_updated->value()); : pb.set_rows_deleted(tablet->metrics()->rows_deleted->value()); : pb.set_insertions_failed_dup_key( : tablet->metrics()->insertions_failed_dup_key->value()); : pb.set_upserts_as_updates(tablet->metrics()->upserts_as_updates->value()); : pb.set_scanner_rows_returned(tablet->metrics()->scanner_rows_returned->value()); : pb.set_scanner_cells_returned(tablet->metrics()->scanner_cells_returned->value()); : pb.set_scanner_bytes_returned(tablet->metrics()->scanner_bytes_returned->value()); : pb.set_scanner_rows_scanned(tablet->metrics()->scanner_rows_scanned->value()); : pb.set_scanner_cells_scanned_from_disk( : tablet->metrics()->scanner_cells_scanned_from_disk->value()); : pb.set_scanner_bytes_scanned_from_disk( : tablet->metrics()->scanner_bytes_scanned_from_disk->value()); : pb.set_scans_started(tablet->metrics()->scans_started->value()); : pb.set_bloom_lookups(tablet->metrics()->bloom_lookups->value()); : pb.set_key_file_lookups(tablet->metrics()->key_file_lookups->value()); : pb.set_delta_file_lookups(tablet->metrics()->delta_file_lookups->value()); : pb.set_mrs_lookups(tablet->metrics()->mrs_lookups->value()); : pb.set_bytes_flushed(tablet->metrics()->bytes_flushed->value()); : pb.set_undo_delta_block_gc_bytes_deleted( : tablet->metrics()->undo_delta_block_gc_bytes_deleted->value()); : pb.set_mem_rowset_size(tablet->MemRowSetSize()); : pb.set_on_disk_data_size(tablet->OnDiskDataSize()); : pb.set_num_rowsets(tablet->num_rowsets()); : pb.set_on_disk_size(OnDiskSize()); : pb.set_live_row_count(live_row_count); I'd advocate for _not_ sending back every single one of these metrics, and limiting the scope of TabletStatsPB to just live row count and on-disk size. AFAICT the goals of this patch are to: - make it convenient to get the size of a table, which is a pretty common task - lay the groundwork for more complex statistics I worry that sending _all_ tablet metrics like this will: - not be useful very frequently (beyond just data size), e.g. it doesn't seem particularly useful to know about the number of rowsets across a table, vs in a single tablet replica, so I'm not sure if there's much value added for aggregating them on the master - be somewhat expensive if sent for thousands of tablets. Plus, since we're sending so many metrics back, this will likely make tablets dirty way more frequently (e.g. a tablet will be dirty if it gets scanned, if it compacts at all, if it GCs WALs, etc.) -- To view, visit http://gerrit.cloudera.org:8080/13426 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I74406ab7cca7c22fda455c328b8ee9989a6b2d99 Gerrit-Change-Number: 13426 Gerrit-PatchSet: 11 Gerrit-Owner: helifu <hzhel...@corp.netease.com> Gerrit-Reviewer: Adar Dembo <a...@cloudera.com> Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mike Percy <mpe...@apache.org> Gerrit-Reviewer: helifu <hzhel...@corp.netease.com> Gerrit-Comment-Date: Wed, 19 Jun 2019 16:58:26 +0000 Gerrit-HasComments: Yes