Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/13456 )
Change subject: [tablet] Support accurate count of rows
......................................................................


Patch Set 11:

(8 comments)

Some nits and a testing suggestion, overall LGTM.

http://gerrit.cloudera.org:8080/#/c/13456/11//COMMIT_MSG
Commit Message:

PS11:
Some important limitations to be aware of are that the counts will be inaccurate if either:
- we are on a Kudu version that supports counting, we downgrade to a version without counts, write some rows, and then upgrade again to a version with counts, or
- we tablet copy between a version that supports counts and a version that doesn't support counts, we write rows to the one that doesn't support counts, and then upgrade it

I can't think of a clean way around either of these, but I think both of these are edge cases that we generally don't expect anyway. Still important to be aware of them.


http://gerrit.cloudera.org:8080/#/c/13456/11//COMMIT_MSG@15
PS11, Line 15: 1.
nit: could you add a space between the numbers and the sentences and align them all like that? E.g.

 1. At the beginning...
    reinsert) into...
 2. Next, MRS...


http://gerrit.cloudera.org:8080/#/c/13456/11/src/kudu/master/sys_catalog.cc
File src/kudu/master/sys_catalog.cc:

http://gerrit.cloudera.org:8080/#/c/13456/11/src/kudu/master/sys_catalog.cc@255
PS11, Line 255:       /*supports_live_row_count=*/ true,
Is it important to have this for the SysCatalog tablet? I suppose it's not a whole lot of extra space, but I'm curious what the rationale for it is.


http://gerrit.cloudera.org:8080/#/c/13456/11/src/kudu/tablet/delta_tracker.h
File src/kudu/tablet/delta_tracker.h:

http://gerrit.cloudera.org:8080/#/c/13456/11/src/kudu/tablet/delta_tracker.h@370
PS11, Line 370: 
              :   // When the flush completes, this is merged into the RowSetMetadata a
nit: strange newline formatting?

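To make the first commit-message limitation above concrete, here is a toy model of the downgrade/upgrade scenario. It is only a sketch: `ToyTablet`, `Insert`, and `CountDrift` are hypothetical names invented for illustration and are not part of Kudu. It assumes that a version without count support simply never touches the persisted counter.

```cpp
#include <cstdint>

// Hypothetical toy model, not Kudu code: a tablet whose persisted counter is
// only maintained by server versions that know about it.
struct ToyTablet {
  int64_t actual_rows = 0;     // Rows really stored in the tablet.
  int64_t live_row_count = 0;  // Persisted counter (assumed metadata field).

  // Insert one row. A version without count support still writes the row,
  // but leaves the persisted counter untouched.
  void Insert(bool version_maintains_count) {
    ++actual_rows;
    if (version_maintains_count) {
      ++live_row_count;
    }
  }
};

// How far the persisted counter lags behind the real number of rows.
int64_t CountDrift(const ToyTablet& t) {
  return t.actual_rows - t.live_row_count;
}
```

In this model, rows written while "downgraded" are never reflected in the counter, so after the upgrade the counter is permanently behind by that many rows, and nothing in the persisted state reveals the gap.
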

http://gerrit.cloudera.org:8080/#/c/13456/11/src/kudu/tablet/metadata.proto
File src/kudu/tablet/metadata.proto:

http://gerrit.cloudera.org:8080/#/c/13456/11/src/kudu/tablet/metadata.proto@138
PS11, Line 138:   // Whether the tablet supports live row counting.
              :   // It's only supported for the newly created ones, not for the ancient ones.
              :   // When false, the 'live_row_count' in every RowSetDataPB is incorrect and
              :   // should be ignored.
nit: Could you reword this slightly?

"Whether the tablet supports counting live rows. If false, 'live_row_count' may be inaccurate and should be ignored."


http://gerrit.cloudera.org:8080/#/c/13456/11/src/kudu/tablet/tablet.cc
File src/kudu/tablet/tablet.cc:

http://gerrit.cloudera.org:8080/#/c/13456/11/src/kudu/tablet/tablet.cc@1919
PS11, Line 1919:   int64_t tmp = 0;
               :   RETURN_NOT_OK(comps->memrowset->CountLiveRows(&tmp));
               :   *count += tmp;
               :   for (const shared_ptr<RowSet>& rowset : comps->rowsets->all_rowsets()) {
               :     RETURN_NOT_OK(rowset->CountLiveRows(&tmp));
               :     *count += tmp;
               :   }
nit: This will update `count` even if this method fails, which can be surprising. Perhaps rewrite it as:

  int64_t ret = 0;
  int64_t tmp = 0;
  RETURN_NOT_OK(comps->memrowset->CountLiveRows(&ret));
  for (const shared_ptr<RowSet>& rowset : comps->rowsets->all_rowsets()) {
    RETURN_NOT_OK(rowset->CountLiveRows(&tmp));
    ret += tmp;
  }
  *count = ret;
  return Status::OK();

so that `count` only gets updated when the method returns successfully.


http://gerrit.cloudera.org:8080/#/c/13456/11/src/kudu/tserver/tablet_copy_client-test.cc
File src/kudu/tserver/tablet_copy_client-test.cc:

http://gerrit.cloudera.org:8080/#/c/13456/11/src/kudu/tserver/tablet_copy_client-test.cc@147
PS11, Line 147:   virtual void GenerateTestData() {
              :     Random rand(SeedRandom());
              :     NO_FATALS(tablet_replica_->tablet_metadata()->
              :         set_supports_live_row_count_for_tests(rand.Next() % 2));
              :     NO_FATALS(TabletCopyTest::GenerateTestData());
              :   }
nit: Would be good to add a comment describing what this does.

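For reference on the tablet.cc comment above, the "accumulate locally, assign once on success" pattern can be tried out in isolation. This is a minimal standalone sketch, not the patch's code: `Status`, `FakeRowSet`, and `SumLiveRows` below are simplified, hypothetical stand-ins for the real Kudu types, and the early return plays the role of `RETURN_NOT_OK`.

```cpp
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for kudu::Status.
struct Status {
  bool ok_flag;
  std::string msg;
  static Status OK() { return {true, ""}; }
  static Status Error(std::string m) { return {false, std::move(m)}; }
  bool ok() const { return ok_flag; }
};

// Hypothetical stand-in for a rowset that can report its live row count.
struct FakeRowSet {
  int64_t live_rows;
  bool fail;  // When true, counting fails and the output is not written.
  Status CountLiveRows(int64_t* count) const {
    if (fail) return Status::Error("count failed");
    *count = live_rows;
    return Status::OK();
  }
};

// Sums the live rows of all rowsets. The out-parameter is written exactly
// once, and only after every rowset has been counted successfully.
Status SumLiveRows(const std::vector<FakeRowSet>& rowsets, int64_t* count) {
  int64_t ret = 0;
  for (const FakeRowSet& rowset : rowsets) {
    int64_t tmp = 0;
    Status s = rowset.CountLiveRows(&tmp);
    if (!s.ok()) return s;  // Early return: *count is left untouched.
    ret += tmp;
  }
  *count = ret;
  return Status::OK();
}
```

With this shape, a caller that passes in a previously initialized `count` never observes a partial sum when one of the rowsets fails.
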

http://gerrit.cloudera.org:8080/#/c/13456/11/src/kudu/tserver/tablet_copy_client-test.cc@412
PS11, Line 412: TEST_F(TabletCopyClientTest, TestSupportsLiveRowCount) {
              :   ASSERT_OK(StartCopy());
              :   ASSERT_EQ(tablet_replica_->tablet_metadata()->supports_live_row_count(),
              :             meta_->supports_live_row_count());
              : }
Do you know whether it would be easy to verify the row counts as well? I am thinking of something like L773 in tool_action_perf.cc: bootstrap a tablet (with no replica) and verify that the counts match.



-- 
To view, visit http://gerrit.cloudera.org:8080/13456
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2e6378e289bb85024c29e96c2b153fc417ed6412
Gerrit-Change-Number: 13456
Gerrit-PatchSet: 11
Gerrit-Owner: helifu <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: helifu <[email protected]>
Gerrit-Comment-Date: Wed, 05 Jun 2019 03:21:26 +0000
Gerrit-HasComments: Yes
