Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12323 )

Change subject: KUDU-2670: Part 1: Build ScanToken by splitSizeBytes
......................................................................


Patch Set 6:

(15 comments)

http://gerrit.cloudera.org:8080/#/c/12323/6/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java
File java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java:

http://gerrit.cloudera.org:8080/#/c/12323/6/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java@1208
PS6, Line 1208:    * @param partitionKey the partition key of the tablet to find
Not clear why this is necessary. As far as I can tell it's for proper retrying 
support of the RPC; could you clarify in the comment?


http://gerrit.cloudera.org:8080/#/c/12323/6/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java@1209
PS6, Line 1209:    * @param splitSizeBytes the number of bytes to try to return 
in each key range.
The wording here suggests that this value is advisory (i.e. the server might 
return smaller or larger ranges). If that's the case, could you explicitly 
state that?


http://gerrit.cloudera.org:8080/#/c/12323/6/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java@1211
PS6, Line 1211:    * @return a deferred object that indicates the completion of 
the request.
Nit: "a {@code Deferred} object which yields the list of key ranges."


http://gerrit.cloudera.org:8080/#/c/12323/6/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java@1213
PS6, Line 1213:   Deferred<SplitKeyRangeResponse> getTabletKeyRanges(final 
KuduTable table,
Is there any use case for applications to call this API directly? If so, should 
there be a synchronous equivalent in KuduClient?


http://gerrit.cloudera.org:8080/#/c/12323/6/java/kudu-client/src/main/java/org/apache/kudu/client/KeyRange.java
File java/kudu-client/src/main/java/org/apache/kudu/client/KeyRange.java:

http://gerrit.cloudera.org:8080/#/c/12323/6/java/kudu-client/src/main/java/org/apache/kudu/client/KeyRange.java@23
PS6, Line 23:  * A KeyRange describes the range in Tablet.
Should document that startPrimaryKey and endPrimaryKey represent encoded 
primary keys.


http://gerrit.cloudera.org:8080/#/c/12323/6/java/kudu-client/src/main/java/org/apache/kudu/client/KeyRange.java@31
PS6, Line 31:   public KeyRange(byte[] startPrimaryKey, byte[] endPrimaryKey, 
Long bytes) {
Not clear what 'bytes' means here.


http://gerrit.cloudera.org:8080/#/c/12323/6/java/kudu-client/src/main/java/org/apache/kudu/client/KuduScanToken.java
File java/kudu-client/src/main/java/org/apache/kudu/client/KuduScanToken.java:

http://gerrit.cloudera.org:8080/#/c/12323/6/java/kudu-client/src/main/java/org/apache/kudu/client/KuduScanToken.java@304
PS6, Line 304:      * Sets a size of split tablet to use when building the list 
of scan tokens.
Again, careful when referencing "split tablet"; it makes it sound as if the 
tablet is being physically split.


http://gerrit.cloudera.org:8080/#/c/12323/6/java/kudu-client/src/main/java/org/apache/kudu/client/KuduScanToken.java@408
PS6, Line 408:         return splitSizeBytes > 0 ? buildTokensBySplit(tablets, 
proto) : buildTokensByDefault(tablets, proto);
Nit: wrap


http://gerrit.cloudera.org:8080/#/c/12323/6/java/kudu-client/src/main/java/org/apache/kudu/client/KuduScanToken.java@428
PS6, Line 428:     public List<KuduScanToken> 
buildTokensBySplit(List<LocatedTablet> tablets,
private


http://gerrit.cloudera.org:8080/#/c/12323/6/java/kudu-client/src/main/java/org/apache/kudu/client/SplitKeyRangeRequest.java
File 
java/kudu-client/src/main/java/org/apache/kudu/client/SplitKeyRangeRequest.java:

http://gerrit.cloudera.org:8080/#/c/12323/6/java/kudu-client/src/main/java/org/apache/kudu/client/SplitKeyRangeRequest.java@33
PS6, Line 33:  * RPC to split tablet into multiple primary key ranges by size.
One of the confusing aspects of this feature is that tablets aren't actually 
being "split" in the sense that after the RPC finishes, there are two tablets 
instead of one. We need to be careful when documenting that, to avoid sending 
the wrong message.

In this case, you could talk about how the tserver responds to the RPC by 
splitting a tablet's primary key range into smaller ranges suitable for 
concurrent scanning.


http://gerrit.cloudera.org:8080/#/c/12323/6/java/kudu-client/src/main/java/org/apache/kudu/client/SplitKeyRangeRequest.java@38
PS6, Line 38:   private final byte[] startPrimaryKey;
            :   private final byte[] endPrimaryKey;
            :   private final byte[] partitionKey;
            :   private final Long splitSizeBytes;
Document the meaning of these fields.


http://gerrit.cloudera.org:8080/#/c/12323/6/java/kudu-client/src/main/java/org/apache/kudu/client/SplitKeyRangeRequest.java@43
PS6, Line 43:
Nit: unnecessary empty line here.


http://gerrit.cloudera.org:8080/#/c/12323/6/java/kudu-client/src/test/java/org/apache/kudu/util/SplitTestUtil.java
File java/kudu-client/src/test/java/org/apache/kudu/util/SplitTestUtil.java:

PS6:
License header.


http://gerrit.cloudera.org:8080/#/c/12323/6/java/kudu-client/src/test/java/org/apache/kudu/util/SplitTestUtil.java@104
PS6, Line 104:     for (int i=1; i<numbers.size() - 1;i+=2){
for (int i = 1; i < numbers.size() - 1; i += 2) {


http://gerrit.cloudera.org:8080/#/c/12323/6/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduReadOptions.scala
File 
java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduReadOptions.scala:

http://gerrit.cloudera.org:8080/#/c/12323/6/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduReadOptions.scala@37
PS6, Line 37:  * @param splitSizeBytes Set the number of bytes used to split 
tablet.
> Perhaps make these docs a bit more spark centric. Something along the lines
Agreed with Grant, though we need to be careful when we talk about "splitting 
tablets"; we want to emphasize that the tablets _themselves_ are not being 
split. Only the range of primary keys to be scanned is being split.



--
To view, visit http://gerrit.cloudera.org:8080/12323
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0502f5d64569e8b1d45e88de3cb36aa2e01234d0
Gerrit-Change-Number: 12323
Gerrit-PatchSet: 6
Gerrit-Owner: yangz <zhe...@gmail.com>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Grant Henke <granthe...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yao Xu <ocla...@gmail.com>
Gerrit-Reviewer: yangz <zhe...@gmail.com>
Gerrit-Comment-Date: Tue, 05 Mar 2019 23:59:17 +0000
Gerrit-HasComments: Yes

Reply via email to