Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/12715 )

Change subject: [java] Make the KuduScanner iterable
......................................................................


Patch Set 4:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/12715/2/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduScanner.java
File java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduScanner.java:

http://gerrit.cloudera.org:8080/#/c/12715/2/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduScanner.java@1030
PS2, Line 1030:               throw new NonRecoverableException(statusIncomplete);
> A separate configuration can be defined and passed around by the drivers. I
Right, in this case it sounds like something outside the scan token makes more 
sense, because:
1) It's only an issue for the Java client, whose different memory semantics 
enable this user-configurable tradeoff.
2) It _is_ something that executors would care about (not drivers), because it 
affects how the scanner-consuming code should be written.

I think a scanner setter is fine, though I might reduce the visibility/audience 
a bit (i.e. not fully "stable" or whatever) since it's a bit esoteric.


http://gerrit.cloudera.org:8080/#/c/12715/4/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduScanner.java
File java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduScanner.java:

http://gerrit.cloudera.org:8080/#/c/12715/4/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduScanner.java@416
PS4, Line 416: if the RowResults
             :    * will not be stored between calls to {@link RowResultIterator#next()).
I think this last part needs to state the limitations more clearly and more 
loudly. How about something like:

  This can be a useful optimization to reduce the number of objects created.

  Note: DO NOT use this if the RowResult is stored between calls to next().
  Enabling this optimization means that a call to next() invalidates the
  previously returned RowResult; accessing it after next() (by e.g. storing
  all RowResults in a collection and accessing them later) will lead to
  <whatever bad stuff happens>
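
To make the warning concrete, here is a minimal sketch of the hazard. It does not use the Kudu client at all: DemoRow, ReusingIterator and ReuseHazardDemo are hypothetical stand-ins invented for illustration, and the failure mode shown (every stored reference ends up reading the last row) is only one plausible outcome in this toy, not a statement about what the real RowResult does.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical mutable row; a stand-in, NOT the Kudu RowResult class.
final class DemoRow {
  int value;
}

// Hypothetical iterator that always returns the same DemoRow instance,
// overwriting its contents on every call to next().
final class ReusingIterator implements Iterator<DemoRow> {
  private final int[] data;
  private final DemoRow shared = new DemoRow();
  private int index = 0;

  ReusingIterator(int[] data) {
    this.data = data;
  }

  @Override
  public boolean hasNext() {
    return index < data.length;
  }

  @Override
  public DemoRow next() {
    if (!hasNext()) {
      throw new NoSuchElementException();
    }
    shared.value = data[index++];
    return shared;
  }
}

final class ReuseHazardDemo {
  // Unsafe pattern: keeps the returned references and reads them only
  // after iteration has finished, so all of them alias the shared row.
  static List<Integer> storeRowsThenRead(int[] data) {
    List<DemoRow> stored = new ArrayList<>();
    Iterator<DemoRow> it = new ReusingIterator(data);
    while (it.hasNext()) {
      stored.add(it.next());
    }
    List<Integer> values = new ArrayList<>();
    for (DemoRow row : stored) {
      values.add(row.value);
    }
    return values;
  }

  // Safe pattern: copies the needed value out of the row before the
  // following call to next() overwrites it.
  static List<Integer> copyValuesWhileIterating(int[] data) {
    List<Integer> values = new ArrayList<>();
    Iterator<DemoRow> it = new ReusingIterator(data);
    while (it.hasNext()) {
      values.add(it.next().value);
    }
    return values;
  }
}
```

In this toy, storeRowsThenRead sees only the final row's value in every slot, while copyValuesWhileIterating sees each value once. That is the kind of distinction the Javadoc should spell out.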


http://gerrit.cloudera.org:8080/#/c/12715/4/java/kudu-client/src/main/java/org/apache/kudu/client/KuduScanner.java
File java/kudu-client/src/main/java/org/apache/kudu/client/KuduScanner.java:

http://gerrit.cloudera.org:8080/#/c/12715/4/java/kudu-client/src/main/java/org/apache/kudu/client/KuduScanner.java@49
PS4, Line 49: This can
            :    * be a useful optimization to reduce the number of objects created if the RowResults
            :    * will not be stored between calls to {@link RowResultIterator#next()).
See what I wrote in AsyncKuduScanner.


http://gerrit.cloudera.org:8080/#/c/12715/4/java/kudu-client/src/main/java/org/apache/kudu/client/RowResultIterator.java
File java/kudu-client/src/main/java/org/apache/kudu/client/RowResultIterator.java:

http://gerrit.cloudera.org:8080/#/c/12715/4/java/kudu-client/src/main/java/org/apache/kudu/client/RowResultIterator.java@71
PS4, Line 71:     this.reuseRowResult = reuseRowResult;
You don't actually need this.reuseRowResult; seems like you could get by with 
checking whether this.sharedRowResult is null or not.
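
Roughly what I have in mind, as a sketch only. SketchRow and SketchIterator are hypothetical stand-ins rather than the real RowResult and RowResultIterator, and reusesRows() is a helper invented for illustration; only the field names sharedRowResult and reuseRowResult come from the patch. The constructor argument decides whether the shared instance exists, and nothing else needs to remember the flag.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical stand-in for a row; NOT the Kudu RowResult class.
final class SketchRow {
  int value;
}

// Hypothetical iterator; only the field layout mirrors the review comment.
final class SketchIterator implements Iterator<SketchRow> {
  private final int[] data;
  // Non-null only when reuse was requested, so no separate boolean is kept.
  private final SketchRow sharedRowResult;
  private int index = 0;

  SketchIterator(int[] data, boolean reuseRowResult) {
    this.data = data;
    this.sharedRowResult = reuseRowResult ? new SketchRow() : null;
  }

  // Hypothetical helper: the reuse mode is derived from the field itself.
  boolean reusesRows() {
    return sharedRowResult != null;
  }

  @Override
  public boolean hasNext() {
    return index < data.length;
  }

  @Override
  public SketchRow next() {
    if (!hasNext()) {
      throw new NoSuchElementException();
    }
    // Reuse the shared instance if there is one, otherwise allocate.
    SketchRow row = (sharedRowResult != null) ? sharedRowResult : new SketchRow();
    row.value = data[index++];
    return row;
  }
}
```

The tradeoff is one less field to keep consistent, at the cost of making the null-ness of sharedRowResult carry meaning, which is worth a short comment on the field.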


http://gerrit.cloudera.org:8080/#/c/12715/4/java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduScanner.java
File java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduScanner.java:

http://gerrit.cloudera.org:8080/#/c/12715/4/java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduScanner.java@63
PS4, Line 63:     KuduSession session = client.newSession();
Isn't the default mode for a new session AUTO_FLUSH_SYNC? In which case you 
don't need the explicit flush on L71.


http://gerrit.cloudera.org:8080/#/c/12715/4/java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduScanner.java@73
PS4, Line 73:     // Ensure a java foreach works on the iterable scanner.
Maybe it'd be clearer as "Ensure that when an enhanced for-loop is used, 
there's no sharing of RowResult objects."


http://gerrit.cloudera.org:8080/#/c/12715/4/java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduScanner.java@82
PS4, Line 82:     // Create a scanner with the reuseRowResult optimization.
Then you can juxtapose this comment with the one above (that when 
reuseRowResult=true, RowResult objects are shared).


http://gerrit.cloudera.org:8080/#/c/12715/4/java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduScanner.java@102
PS4, Line 102:     String tableName = "testKeepAlive";
Any reason you can't reuse the class member tableName instead?


http://gerrit.cloudera.org:8080/#/c/12715/4/java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduScanner.java@105
PS4, Line 105:         new ColumnSchema.ColumnSchemaBuilder("val", Type.INT32).build());
Could remove this column; doesn't seem like it's relevant for the test.



--
To view, visit http://gerrit.cloudera.org:8080/12715
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3e4ac59e30d0562c0a381d5e304af1dcfdcf5a1a
Gerrit-Change-Number: 12715
Gerrit-PatchSet: 4
Gerrit-Owner: Grant Henke <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Grant Henke <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <[email protected]>
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-Reviewer: Will Berkeley <[email protected]>
Gerrit-Comment-Date: Tue, 12 Mar 2019 21:07:47 +0000
Gerrit-HasComments: Yes
