helifu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13456 )

Change subject: [tablet] Support accurate count of rows
......................................................................


Patch Set 4:

(19 comments)

http://gerrit.cloudera.org:8080/#/c/13456/4//COMMIT_MSG
Commit Message:

PS4:
> Nice description and ASCII art.
Thanks^_^


http://gerrit.cloudera.org:8080/#/c/13456/4//COMMIT_MSG@7
PS4, Line 7: [tablet] Support accurate count of rows
> How will this work for existing tablets? AFAICT all of the count
 > changes are incremental, which doesn't work at all if the
 > persistent counters are set to 0 in existing tablets.

Yea, it's a big problem indeed. And it seems there is no good way to backward 
compatible. Well, maybe we can give our users some suggestions:
1) reload the data for the dimension tables;
2) roll the fact tables, then the old tablets that have not counters will be 
deleted;
3) if users don't like above ideas, that means they are not interested in this 
feature.
What do you think?


http://gerrit.cloudera.org:8080/#/c/13456/4//COMMIT_MSG@12
PS4, Line 12:   When MRS is being flushed, it will be attached to the 
RowSetTree,
> Technically the MRS was part of the RowSetTree _before_ being flushed too.
Done


http://gerrit.cloudera.org:8080/#/c/13456/4//COMMIT_MSG@14
PS4, Line 14: RowSetMetadataS
> RowSetMetadatas
Done


http://gerrit.cloudera.org:8080/#/c/13456/4//COMMIT_MSG@21
PS4, Line 21: counter3
> Nit: value of counter3
Done


http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/compaction.cc
File src/kudu/tablet/compaction.cc:

http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/compaction.cc@1168
PS4, Line 1168:       // If the REDO is empty, it should not be a DELETE.
              :       if (new_redos_head == nullptr) {
              :         out->increase_valid_row_count();
              :       }
> Are you absolutely sure that new_redos_head != nullptr means the row was de
Yes, because L938~L940 show it's a DELETE when new_redos_head != nullptr. :)


http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/delta_tracker.h
File src/kudu/tablet/delta_tracker.h:

http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/delta_tracker.h@270
PS4, Line 270:   int32_t CountRowsValid() const;
> Nit: add an empty line after.
Done


http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/delta_tracker.h@368
PS4, Line 368:   // For the old DMS which is being flushed, swap its valid row 
count.
             :   // NOTE: this value will be merged to RowSetMetadata after 
finished.
> Nit: rewrite as:
Done


http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/deltamemstore.h
File src/kudu/tablet/deltamemstore.h:

http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/deltamemstore.h@187
PS4, Line 187:   // Number of valid rows in DMS.
> Technically the DMS doesn't allow reinserts, so this value can only be 0 or
Done


http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/diskrowset.h
File src/kudu/tablet/diskrowset.h:

http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/diskrowset.h@225
PS4, Line 225:   void increase_valid_row_count() { valid_row_count_++; }
> The dependency between this and FlushCompactionInput isn't pretty. Instead,
Done


http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/diskrowset.h@282
PS4, Line 282: in this loop.
> Nit: "for the current DRS being written."
Done


http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/memrowset-test.cc
File src/kudu/tablet/memrowset-test.cc:

http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/memrowset-test.cc@825
PS4, Line 825:   ASSERT_EQ(4, mrs->CountRowsValid());
> ASSERT that it's 0 before GenerateTestData?
Done


http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/memrowset.h
File src/kudu/tablet/memrowset.h:

http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/memrowset.h@258
PS4, Line 258:   virtual uint64_t CountRowsValid() const override {
> Google Style Guide says to use int64_t (and assertions) to enforce that a n
Done


http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/metadata.proto
File src/kudu/tablet/metadata.proto:

http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/metadata.proto@50
PS4, Line 50:   optional int32 valid_row_count = 10;
> Even though nothing in RowSetDataPB is documented, could you doc the meanin
Done


http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/rowset_metadata.h
File src/kudu/tablet/rowset_metadata.h:

http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/rowset_metadata.h@228
PS4, Line 228:   void IncreaseValidRowCount(int32_t valid_row_count);
> Nit: Increment is more clear (i.e. we already have methods that start with
Done


http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/rowset_metadata.h@230
PS4, Line 230:   int32_t CountRowsValid() const;
> Name this valid_row_count() since it's a simple accessor. https://google.gi
Done


http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/rowset_metadata.cc
File src/kudu/tablet/rowset_metadata.cc:

http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/rowset_metadata.cc@115
PS4, Line 115:   if (pb.has_valid_row_count()) {
             :     valid_row_count_ = pb.valid_row_count();
             :   }
> Likewise, move this to the end; it looks like it's part of "Load redo delta
Done


http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/rowset_metadata.cc@143
PS4, Line 143:   if (valid_row_count_ != 0) {
             :     pb->set_valid_row_count(valid_row_count_);
             :   }
> Nit: move this to the end of the function. As is it looks like it has to do
Done


http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/tablet.h
File src/kudu/tablet/tablet.h:

http://gerrit.cloudera.org:8080/#/c/13456/4/src/kudu/tablet/tablet.h@350
PS4, Line 350:   // Returns the valid row count in this tablet.
> Doc that "valid" means rows that haven't been deleted. Do you think "live"
Done



--
To view, visit http://gerrit.cloudera.org:8080/13456
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2e6378e289bb85024c29e96c2b153fc417ed6412
Gerrit-Change-Number: 13456
Gerrit-PatchSet: 4
Gerrit-Owner: helifu <hzhel...@corp.netease.com>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: helifu <hzhel...@corp.netease.com>
Gerrit-Comment-Date: Fri, 31 May 2019 17:37:02 +0000
Gerrit-HasComments: Yes

Reply via email to