Hey all,

As I mentioned on Stephen's IO Testing thread a few days ago, I've been
investigating performance testing frameworks. I've written up my findings
in a doc here, and I'd love to hear feedback on the investigation and on
what I'm proposing going forward.

https://docs.google.com/document/d/18ffP1vYurvNe92Efs_6hFFBDYC2dQEdWw135_GWZ2YU/view

Copying from the earlier mail:
The tl;dr version is that there are a number of tools out there, but that
the best one I was able to find was a tool called PerfKit Benchmarker
(PKB)[1]. As it turns out, they already had the ability to benchmark Spark
(I have a PR out to extend the Spark functionality[2] and a couple more
improvements in the works), and I've put together some additional work in a
branch on my repository[3] to enable proof-of-concept Dataflow Java
benchmarks. I'm pretty excited about it overall.

[1] https://github.com/GoogleCloudPlatform/PerfKitBenchmarker
[2] https://github.com/GoogleCloudPlatform/PerfKitBenchmarker/pull/1214
[3] https://github.com/jasonkuster/PerfKitBenchmarker/tree/beam

Looking forward to making progress on this.

Jason

-- 
-------
Jason Kuster
Apache Beam (Incubating) / Google Cloud Dataflow
