Will Berkeley has posted comments on this change. ( http://gerrit.cloudera.org:8080/12251 )
Change subject: [spark] Add metrics to kudu-spark
......................................................................

Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/12251/1/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala
File java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala:

http://gerrit.cloudera.org:8080/#/c/12251/1/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala@69
PS1, Line 69: KA
> A
Done

http://gerrit.cloudera.org:8080/#/c/12251/1/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduRDD.scala
File java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduRDD.scala:

http://gerrit.cloudera.org:8080/#/c/12251/1/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduRDD.scala@130
PS1, Line 130: val rowsRead: LongAccumulator
> doc?
Done

http://gerrit.cloudera.org:8080/#/c/12251/1/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduRDD.scala@156
PS1, Line 156: rowsRead.add(currentIterator.getNumRows)
> This is a bit tricky. The rows have been read from Kudu but not necessarily
The count is taken here because this metric is about performance measurement (how many rows did executor 1 read vs. executor 2, and in how long?), as opposed to checking correctness (I don't see all 1000 rows that I expect... how many did each executor read?). For the former, I think it is better to measure what is read from Kudu, regardless of whether the rows are consumed by the Spark application.
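
To make the distinction concrete, here is a minimal sketch of the two counting points. It is not the kudu-spark code: `BatchScanner` and `CountingRowIterator` are hypothetical stand-ins for the scanner and the row iterator, and `AtomicLong` stands in for Spark's `LongAccumulator` so that the example runs without a Spark or Kudu dependency.

```scala
import java.util.concurrent.atomic.AtomicLong

// Hypothetical stand-in for a scanner that returns rows in batches.
final class BatchScanner(batches: List[Vector[Int]]) {
  private var remaining = batches
  def hasMoreRows: Boolean = remaining.nonEmpty
  def nextRows(): Vector[Int] = {
    val batch = remaining.head
    remaining = remaining.tail
    batch
  }
}

// Hypothetical row iterator that keeps two counters:
//   rowsFetched  grows by the batch size when a batch is pulled from the scanner
//   rowsConsumed grows by one when the caller actually takes a row
final class CountingRowIterator(
    scanner: BatchScanner,
    rowsFetched: AtomicLong,
    rowsConsumed: AtomicLong)
  extends Iterator[Int] {

  private var current: Iterator[Int] = Iterator.empty

  override def hasNext: Boolean = {
    // Pull the next batch only when the current one is exhausted.
    while (!current.hasNext && scanner.hasMoreRows) {
      val batch = scanner.nextRows()
      rowsFetched.addAndGet(batch.size.toLong) // counted at fetch time
      current = batch.iterator
    }
    current.hasNext
  }

  override def next(): Int = {
    if (!hasNext) throw new NoSuchElementException("no more rows")
    rowsConsumed.incrementAndGet() // counted at consumption time
    current.next()
  }
}

object RowCountDemo {
  // Returns (rows fetched from the scanner, rows consumed by the caller).
  def run(): (Long, Long) = {
    val fetched = new AtomicLong(0)
    val consumed = new AtomicLong(0)
    val scanner = new BatchScanner(List(Vector(1, 2, 3, 4, 5), Vector(6, 7, 8, 9, 10)))
    val rows = new CountingRowIterator(scanner, fetched, consumed)
    rows.take(3).toList // the application stops after three rows
    (fetched.get, consumed.get)
  }
}
```

When the caller stops partway through a batch, the fetch-time counter is ahead of the consumption-time counter. The fetch-time number is the one that reflects the work done against the storage layer, which is what a per-executor performance metric wants.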
--
To view, visit http://gerrit.cloudera.org:8080/12251
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ied81c890b3d4510f767d8f023963ff878f398140
Gerrit-Change-Number: 12251
Gerrit-PatchSet: 1
Gerrit-Owner: Will Berkeley <wdberke...@gmail.com>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Grant Henke <granthe...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Will Berkeley <wdberke...@gmail.com>
Gerrit-Comment-Date: Wed, 23 Jan 2019 19:57:47 +0000
Gerrit-HasComments: Yes