helifu has posted comments on this change. ( http://gerrit.cloudera.org:8080/13456 )
Change subject: [tablet] Support accurate count of rows ...................................................................... Patch Set 4: (19 comments) http://gerrit.cloudera.org:8080/#/c/13456/4//COMMIT_MSG Commit Message: PS4: > Nice description and ASCII art. Thanks^_^ http://gerrit.cloudera.org:8080/#/c/13456/4//COMMIT_MSG@7 PS4, Line 7: [tablet] Support accurate count of rows > How will this work for existing tablets? AFAICT all of the count > changes are incremental, which doesn't work at all if the > persistent counters are set to 0 in existing tablets. Yea, it's a big problem indeed. And it seems there is no good way to backward compatible. Well, maybe we can give our users some suggestions: 1) reload the data for the dimension tables; 2) roll the fact tables, then the old tablets that have not counters will be deleted; 3) if users don't like above ideas, that means they are not interested in this feature. What do you think? http://gerrit.cloudera.org:8080/#/c/13456/4//COMMIT_MSG@12 PS4, Line 12: When MRS is being flushed, it will be attached to the RowSetTree, > Technically the MRS was part of the RowSetTree _before_ being flushed too. Done http://gerrit.cloudera.org:8080/#/c/13456/4//COMMIT_MSG@14 PS4, Line 14: RowSetMetadataS > RowSetMetadatas Done http://gerrit.cloudera.org:8080/#/c/13456/4//COMMIT_MSG@21 PS4, Line 21: counter3 > Nit: value of counter3 Done http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/compaction.cc File src/kudu/tablet/compaction.cc: http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/compaction.cc@1168 PS4, Line 1168: // If the REDO is empty, it should not be a DELETE. : if (new_redos_head == nullptr) { : out->increase_valid_row_count(); : } > Are you absolutely sure that new_redos_head != nullptr means the row was de Yes, because L938~L940 show it's a DELETE when new_redos_head != nullptr. :) http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/delta_tracker.h File src/kudu/tablet/delta_tracker.h: http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/delta_tracker.h@270 PS4, Line 270: int32_t CountRowsValid() const; > Nit: add an empty line after. Done http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/delta_tracker.h@368 PS4, Line 368: // For the old DMS which is being flushed, swap its valid row count. : // NOTE: this value will be merged to RowSetMetadata after finished. > Nit: rewrite as: Done http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/deltamemstore.h File src/kudu/tablet/deltamemstore.h: http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/deltamemstore.h@187 PS4, Line 187: // Number of valid rows in DMS. > Technically the DMS doesn't allow reinserts, so this value can only be 0 or Done http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/diskrowset.h File src/kudu/tablet/diskrowset.h: http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/diskrowset.h@225 PS4, Line 225: void increase_valid_row_count() { valid_row_count_++; } > The dependency between this and FlushCompactionInput isn't pretty. Instead, Done http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/diskrowset.h@282 PS4, Line 282: in this loop. > Nit: "for the current DRS being written." Done http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/memrowset-test.cc File src/kudu/tablet/memrowset-test.cc: http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/memrowset-test.cc@825 PS4, Line 825: ASSERT_EQ(4, mrs->CountRowsValid()); > ASSERT that it's 0 before GenerateTestData? Done http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/memrowset.h File src/kudu/tablet/memrowset.h: http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/memrowset.h@258 PS4, Line 258: virtual uint64_t CountRowsValid() const override { > Google Style Guide says to use int64_t (and assertions) to enforce that a n Done http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/metadata.proto File src/kudu/tablet/metadata.proto: http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/metadata.proto@50 PS4, Line 50: optional int32 valid_row_count = 10; > Even though nothing in RowSetDataPB is documented, could you doc the meanin Done http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/rowset_metadata.h File src/kudu/tablet/rowset_metadata.h: http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/rowset_metadata.h@228 PS4, Line 228: void IncreaseValidRowCount(int32_t valid_row_count); > Nit: Increment is more clear (i.e. we already have methods that start with Done http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/rowset_metadata.h@230 PS4, Line 230: int32_t CountRowsValid() const; > Name this valid_row_count() since it's a simple accessor. https://google.gi Done http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/rowset_metadata.cc File src/kudu/tablet/rowset_metadata.cc: http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/rowset_metadata.cc@115 PS4, Line 115: if (pb.has_valid_row_count()) { : valid_row_count_ = pb.valid_row_count(); : } > Likewise, move this to the end; it looks like it's part of "Load redo delta Done http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/rowset_metadata.cc@143 PS4, Line 143: if (valid_row_count_ != 0) { : pb->set_valid_row_count(valid_row_count_); : } > Nit: move this to the end of the function. As is it looks like it has to do Done http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/tablet.h File src/kudu/tablet/tablet.h: http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/tablet.h@350 PS4, Line 350: // Returns the valid row count in this tablet. > Doc that "valid" means rows that haven't been deleted. Do you think "live" Done -- To view, visit http://gerrit.cloudera.org:8080/13456 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2e6378e289bb85024c29e96c2b153fc417ed6412 Gerrit-Change-Number: 13456 Gerrit-PatchSet: 4 Gerrit-Owner: helifu <hzhel...@corp.netease.com> Gerrit-Reviewer: Adar Dembo <a...@cloudera.com> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: helifu <hzhel...@corp.netease.com> Gerrit-Comment-Date: Fri, 31 May 2019 17:37:02 +0000 Gerrit-HasComments: Yes