Yao Xu has uploaded a new patch set (#3) to the change originally created by yangz. ( http://gerrit.cloudera.org:8080/12323 )
Change subject: KUDU-2670: Part 1: Build ScanToken by splitSizeBytes
......................................................................

KUDU-2670: Part 1: Build ScanToken by splitSizeBytes

When reading a Kudu table from Spark, a tablet that holds a large
amount of data takes a long time to read. The reason is that KuduRDD
generates one scan token per tablet, so a single Spark task has to
process all the data in that tablet.

This patch sends a SplitKeyRange RPC to the tablet server, which splits
the tablet into multiple primary key ranges by size, and then generates
the scan tokens from the tablet's key ranges.

Change-Id: I0502f5d64569e8b1d45e88de3cb36aa2e01234d0
---
M java/kudu-client/src/main/java/org/apache/kudu/client/AbstractKuduScannerBuilder.java
M java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java
A java/kudu-client/src/main/java/org/apache/kudu/client/KeyRange.java
M java/kudu-client/src/main/java/org/apache/kudu/client/KuduScanToken.java
A java/kudu-client/src/main/java/org/apache/kudu/client/SplitKeyRangeRequest.java
A java/kudu-client/src/main/java/org/apache/kudu/client/SplitKeyRangeResponse.java
A java/kudu-client/src/test/java/org/apache/kudu/client/TestSplitKeyRangeRequest.java
A java/kudu-client/src/test/java/org/apache/kudu/util/SplitTestUtil.java
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduRDD.scala
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduReadOptions.scala
11 files changed, 567 insertions(+), 8 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/23/12323/3
--
To view, visit http://gerrit.cloudera.org:8080/12323
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0502f5d64569e8b1d45e88de3cb36aa2e01234d0
Gerrit-Change-Number: 12323
Gerrit-PatchSet: 3
Gerrit-Owner: yangz <zhe...@gmail.com>
Gerrit-Reviewer: Grant Henke <granthe...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yao Xu <ocla...@gmail.com>
Gerrit-Reviewer: yangz <zhe...@gmail.com>
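
For readers skimming this notification, the idea in the commit message (cut one tablet into size-bounded primary key ranges, then build one scan token per range) can be illustrated with a small standalone sketch. This is not the patch's code and does not use the Kudu client API: `SplitSizeSketch`, `ChunkRange`, and `splitBySize` are hypothetical names, rows are modelled as plain indexes, and every row is assumed to have the same size. In the actual change the split is computed by the tablet server in response to the SplitKeyRange RPC.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitSizeSketch {

    // Hypothetical model of one key range: rows [startRow, endRow) and its estimated size.
    static final class ChunkRange {
        final long startRow;
        final long endRow;
        final long sizeBytes;

        ChunkRange(long startRow, long endRow, long sizeBytes) {
            this.startRow = startRow;
            this.endRow = endRow;
            this.sizeBytes = sizeBytes;
        }
    }

    // Splits rows [0, rowCount) into consecutive ranges of at most splitSizeBytes each,
    // assuming a fixed bytesPerRow (a simplification). If a single row is larger than
    // splitSizeBytes, each range holds exactly one row.
    static List<ChunkRange> splitBySize(long rowCount, long bytesPerRow, long splitSizeBytes) {
        if (bytesPerRow <= 0 || splitSizeBytes <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        long rowsPerRange = Math.max(1, splitSizeBytes / bytesPerRow);
        List<ChunkRange> ranges = new ArrayList<>();
        for (long start = 0; start < rowCount; start += rowsPerRange) {
            long end = Math.min(rowCount, start + rowsPerRange);
            ranges.add(new ChunkRange(start, end, (end - start) * bytesPerRow));
        }
        return ranges;
    }

    public static void main(String[] args) {
        // One "tablet" of 1000 rows at 100 bytes per row, split at 32 KiB.
        // Each resulting range would back one scan token, and so one Spark task.
        for (ChunkRange r : splitBySize(1000, 100, 32 * 1024)) {
            System.out.println("[" + r.startRow + ", " + r.endRow + ") " + r.sizeBytes + " bytes");
        }
    }
}
```

With a split size smaller than the tablet, one tablet yields several ranges instead of one, so the read can be spread over several Spark tasks rather than a single long-running one.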