Will Berkeley has posted comments on this change. ( http://gerrit.cloudera.org:8080/12251 )

Change subject: [spark] Add metrics to kudu-spark
......................................................................


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/12251/1/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala
File java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala:

http://gerrit.cloudera.org:8080/#/c/12251/1/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala@69
PS1, Line 69: KA
> A
Done


http://gerrit.cloudera.org:8080/#/c/12251/1/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduRDD.scala
File java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduRDD.scala:

http://gerrit.cloudera.org:8080/#/c/12251/1/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduRDD.scala@130
PS1, Line 130: val rowsRead: LongAccumulator
> doc?
Done


http://gerrit.cloudera.org:8080/#/c/12251/1/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduRDD.scala@156
PS1, Line 156:       rowsRead.add(currentIterator.getNumRows)
> This is a bit tricky. The rows have been read from Kudu but not necessarily
It is done here because this metric is for performance measurement
(how many rows did executor 1 read vs. executor 2, and in how long?), as opposed
to checking correctness (I don't see all 1000 rows that I expect... how many did
each executor read?). For the former, I think it is better to measure what is
read from Kudu regardless of whether the rows are consumed by the Spark
application.
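
To make the distinction concrete, here is a minimal, self-contained sketch (not
the code in this patch) of the two places a row count could be taken. Batch,
CountingRowIterator, and RowCountDemo are hypothetical names, and plain
AtomicLong counters stand in for Spark's LongAccumulator so that the sketch
runs without Spark or Kudu:

```scala
import java.util.concurrent.atomic.AtomicLong

// Hypothetical stand-in for the rows returned by one scanner round trip.
final case class Batch(rows: Vector[Int]) {
  def numRows: Int = rows.size
}

// Flattens batches into rows and counts at two different points:
//   fetched  - bumped when a whole batch is pulled from the source
//   consumed - bumped when a single row is handed to the caller
final class CountingRowIterator(
    batches: Iterator[Batch],
    fetched: AtomicLong,
    consumed: AtomicLong) extends Iterator[Int] {

  private var current: Iterator[Int] = Iterator.empty

  override def hasNext: Boolean = {
    // Pull the next batch only once the current one is exhausted.
    while (!current.hasNext && batches.hasNext) {
      val batch = batches.next()
      fetched.addAndGet(batch.numRows.toLong)
      current = batch.rows.iterator
    }
    current.hasNext
  }

  override def next(): Int = {
    if (!hasNext) throw new NoSuchElementException("no more rows")
    consumed.incrementAndGet()
    current.next()
  }
}

object RowCountDemo {
  // Returns (rows fetched from the source, rows consumed by the caller).
  def run(): (Long, Long) = {
    val fetched = new AtomicLong(0)
    val consumed = new AtomicLong(0)
    val batches = Iterator(Batch(Vector(1, 2, 3, 4)), Batch(Vector(5, 6, 7, 8)))
    val rows = new CountingRowIterator(batches, fetched, consumed)

    // The caller stops after five rows, partway through the second batch.
    var taken = 0
    while (taken < 5 && rows.hasNext) {
      rows.next()
      taken += 1
    }
    (fetched.get, consumed.get)
  }
}
```

With two batches of four rows and a caller that stops after five rows,
RowCountDemo.run() returns (8, 5): eight rows were fetched but only five were
consumed. The fetch-time count is the one that reflects the work done against
the source, which is the number a performance metric is after.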



--
To view, visit http://gerrit.cloudera.org:8080/12251
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ied81c890b3d4510f767d8f023963ff878f398140
Gerrit-Change-Number: 12251
Gerrit-PatchSet: 1
Gerrit-Owner: Will Berkeley <wdberke...@gmail.com>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Grant Henke <granthe...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Will Berkeley <wdberke...@gmail.com>
Gerrit-Comment-Date: Wed, 23 Jan 2019 19:57:47 +0000
Gerrit-HasComments: Yes
