Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/12261 )

Change subject: [spark] Add write duration histograms
......................................................................


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/12261/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/12261/1//COMMIT_MSG@14
PS1, Line 14: 25.0%: 14ms, 25.0%: 14ms
Why is the information for every bin duplicated in the output?  Is that
intended?


http://gerrit.cloudera.org:8080/#/c/12261/1//COMMIT_MSG@21
PS1, Line 21: need to be shipped between executors and the driver, so
            : their (serialized) size is relevant
How often does that happen?  Does it depend on the granularity of the histogram 
or something else?
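
One way to check the size question empirically is sketched below. It is only an illustration under stated assumptions: FixedBinHistogram, SerializedSizeSketch, and serializedSize are hypothetical names, the histogram is a plain array of bin counters rather than the HdrHistogram class used in the patch, and it uses plain Java serialization, which may differ from what Spark actually uses for accumulators.

```scala
import java.io.{ByteArrayOutputStream, ObjectOutputStream}

// Hypothetical stand-in for a histogram: one Long counter per bin.
// This is NOT HdrHistogram; it only shows how size scales with bin count.
final case class FixedBinHistogram(counts: Array[Long]) extends Serializable

object SerializedSizeSketch {
  // Serialize with plain Java serialization and return the number of bytes.
  def serializedSize(obj: AnyRef): Int = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    try out.writeObject(obj) finally out.close()
    bytes.size()
  }

  def main(args: Array[String]): Unit = {
    // Compare a coarse, a medium, and a fine-grained histogram.
    for (bins <- Seq(16, 256, 4096)) {
      val h = FixedBinHistogram(new Array[Long](bins))
      println(s"bins=$bins serializedBytes=${serializedSize(h)}")
    }
  }
}
```

In this sketch the serialized size grows with the number of bins, since each bin contributes one 8-byte counter. Whether the same holds for the histograms in the patch depends on their encoding.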


http://gerrit.cloudera.org:8080/#/c/12261/1/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/HdrHistogramAccumulator.scala
File java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/HdrHistogramAccumulator.scala:

http://gerrit.cloudera.org:8080/#/c/12261/1/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/HdrHistogramAccumulator.scala@66
PS1, Line 66:   override def value: HistogramWrapper = histogram
> I looked into this. Yes, it's possible to subclass SynchronizedHistogram an
Thank you for clarifying this!

In that case, the original approach looks good enough to me.



-- 
To view, visit http://gerrit.cloudera.org:8080/12261
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0fd4d380b08bd7d7d5c1e65b79cffb44a9b9d433
Gerrit-Change-Number: 12261
Gerrit-PatchSet: 1
Gerrit-Owner: Will Berkeley <wdberke...@gmail.com>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Grant Henke <granthe...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Will Berkeley <wdberke...@gmail.com>
Gerrit-Comment-Date: Tue, 29 Jan 2019 19:05:48 +0000
Gerrit-HasComments: Yes
