Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/13456 )
Change subject: [tablet] Support accurate count of rows ...................................................................... Patch Set 6: (5 comments) http://gerrit.cloudera.org:8080/#/c/13456/6//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/13456/6//COMMIT_MSG@32 PS6, Line 32: Jjust Just http://gerrit.cloudera.org:8080/#/c/13456/6//COMMIT_MSG@33 PS6, Line 33: 8.For the historical tablets: : When a negative number is returned, it means it is a historical : tablet. And a historical tablet can return a positive number of : live rows after compaction and etc. This isn't as clear as it could be. It's important to add that a tablet will only transition from "historical" to "regular" (i.e. with a real live row count) after every single DRS has been rewritten. Meaning, as long as there's one historical DRS, the overall tablet live row count will be -1. BTW, do you think this is better than a superblock-level capability that's set for new tablets? Do you really expect tablets to eventually be fully recompacted such that every DRS has a correct live row count? I wouldn't expect that out of most workloads: often times there are sections of keyspace that are "ancient" and are probably already fully compacted. Speaking of -1, could we have the chain of functions involved in getting the live row count return a Status so that we can convert a -1 into Status::NotSupported? I think it'll be clearer for callers. http://gerrit.cloudera.org:8080/#/c/13456/6/src/kudu/tablet/delta_tracker.h File src/kudu/tablet/delta_tracker.h: http://gerrit.cloudera.org:8080/#/c/13456/6/src/kudu/tablet/delta_tracker.h@269 PS6, Line 269: // Count the number of deleted rows in this Tracker. The comment suggests that it'll count up the rows in the entire tracker, (i.e. also in the UNDOs and REDOs). But that's not accurate: it only returns deleted rows in the current DMS as well as in a flushing DMS (if one exists). http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/memrowset.h File src/kudu/tablet/memrowset.h: http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/memrowset.h@258 PS4, Line 258: virtual int64_t CountLiveRows() const override { > Done Having done this, could you add DCHECKs to the various counting/subtraction functions that ensure that the row count never becomes negative? http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/rowset_metadata.h File src/kudu/tablet/rowset_metadata.h: http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/rowset_metadata.h@228 PS4, Line 228: // Note: positive number means live rows and negative number means deleted rows. > Done Sorry, I didn't mean to suggest that you should rename methods like these "Increment", but rather to "IncrementValidRowCount" instead of "IncreaseValidRowCount". Without the "ValidRowCount" part the name is too ambiguous. -- To view, visit http://gerrit.cloudera.org:8080/13456 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2e6378e289bb85024c29e96c2b153fc417ed6412 Gerrit-Change-Number: 13456 Gerrit-PatchSet: 6 Gerrit-Owner: helifu <hzhel...@corp.netease.com> Gerrit-Reviewer: Adar Dembo <a...@cloudera.com> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: helifu <hzhel...@corp.netease.com> Gerrit-Comment-Date: Sun, 02 Jun 2019 22:06:12 +0000 Gerrit-HasComments: Yes