Will Berkeley has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11381 )

Change subject: KUDU-2566: fix two todo list
......................................................................


Patch Set 2:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/11381/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11381/2//COMMIT_MSG@7
PS2, Line 7: fix two todo list
Could you make the subject more descriptive? Howabout "[tablet] KUDU-2566:  
Enhance rowset tree pruning"?


http://gerrit.cloudera.org:8080/#/c/11381/2//COMMIT_MSG@10
PS2, Line 10: support open-ended intervals
Following good commit message guidelines, could you explain what the previous 
behavior was, and how this patch changes it? Maybe:

Previously, the rowset tree only could only compute overlap with a closed and 
bounded interval. This changeset adds the ability to find overlap for unbounded 
intervals as well. As a result, tablet iterators with a single primary key 
bound are more efficient because they do not iterate over rowsets that don't 
overlap with the key bound.


http://gerrit.cloudera.org:8080/#/c/11381/2//COMMIT_MSG@11
PS2, Line 11: end up fetching one more rowset for the upper bound which is 
exclusive
Again, let's fill out the explanation here a bit. My suggestion (adding on to 
the previous suggestion):

This changeset also adds the ability to query the rowset tree using an 
exclusive upper bound, whereas previously only inclusive intervals were 
supported. This makes scans more efficient since primary key upper bounds are 
passed as exclusive bounds, so now rowsets that overlap only in the upper bound 
are excluded.


http://gerrit.cloudera.org:8080/#/c/11381/2/src/kudu/tablet/rowset_tree.h
File src/kudu/tablet/rowset_tree.h:

http://gerrit.cloudera.org:8080/#/c/11381/2/src/kudu/tablet/rowset_tree.h@88
PS2, Line 88:   void FindRowSetsIntersectingInterval(const Slice &lower_bound,
            :                                        const Slice &upper_bound,
            :                                        std::vector<RowSet *> 
*rowsets) const;
            :   void FindRowSetsIntersectingIntervalGE(const Slice& lower_bound,
            :                                          std::vector<RowSet*> 
*rowsets) const;
            :   void FindRowSetsIntersectingIntervalLT(const Slice& upper_bound,
            :                                          std::vector<RowSet*> 
*rowsets) const;
> Rather than introducing new FindRowSetsIntersectingInterval... methods, cou
+1. We could also accept boost::none for both arguments and short-circuit 
return all rowsets, which would allow the logic in tablet.cc to be very simple 
(because it's pushed into the rowset tree).


http://gerrit.cloudera.org:8080/#/c/11381/2/src/kudu/util/interval_tree.h
File src/kudu/util/interval_tree.h:

http://gerrit.cloudera.org:8080/#/c/11381/2/src/kudu/util/interval_tree.h@140
PS2, Line 140:   // Greater than or equal.
             :   // The 'query' is the lower_bound.
             :   template<class QueryPointType>
             :   void FindIntersectingIntervalGE(const QueryPointType& query,
             :                                   IntervalVector *results) const;
             :   // Less than or equal.
             :   // The 'query' is the upper_bound.
             :   template<class QueryPointType>
             :   void FindIntersectingIntervalLT(const QueryPointType &query,
             :                                   IntervalVector *results) const;
             :
> Rather than introduce these new methods, would it be possible to modify int
+1 to the idea though I think it might suffice to add an additional 
QueryIntervalType analogous to QueryPointType, and have the QueryIntervalType 
be able to be unbounded. The interval tree stores objects of type 
interval_type, so generalizing this type to support open-ended ranges means the 
interval tree needs to support storing open ended intervals, which is not 
necessary for the rowset tree.

In particular, the rowset tree's interval_type is RowsetWithBounds, which 
appears ot have been optimized for the rowset compaction algorithm, so there's 
other perf implications for changing it to be able to describe open-ended 
intervals..



--
To view, visit http://gerrit.cloudera.org:8080/11381
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0e34b4169f93f3519f694c9bf86ca5295c318e79
Gerrit-Change-Number: 11381
Gerrit-PatchSet: 2
Gerrit-Owner: helifu <hzhel...@corp.netease.com>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Will Berkeley <wdberke...@gmail.com>
Gerrit-Comment-Date: Wed, 05 Sep 2018 15:30:40 +0000
Gerrit-HasComments: Yes

Reply via email to