This is great, Jason! Let me know if/how I can assist with Spark, or generally.

Thanks,
Amit

On Thu, Dec 8, 2016 at 9:01 PM Jason Kuster <[email protected]> wrote:

> Hey all,
>
> So as I mentioned on Stephen's IO Testing thread a few days ago I've been
> doing a bunch of investigating into performance testing frameworks. I've
> put all my thoughts into a doc here and I'd love to hear thoughts about my
> investigation and what I'm proposing going forward.
>
> https://docs.google.com/document/d/18ffP1vYurvNe92Efs_6hFFBDYC2dQEdWw135_GWZ2YU/view
>
> Copying from the earlier mail:
> The tl;dr version is that there are a number of tools out there, but that
> the best one I was able to find was a tool called PerfKit Benchmarker
> (PKB)[1]. As it turns out, they already had the ability to benchmark Spark
> (I have a PR out to extend the Spark functionality[2] and a couple more
> improvements in the works), and I've put together some additional work in a
> branch on my repository[3] to enable proof-of-concept Dataflow Java
> benchmarks. I'm pretty excited about it overall.
>
> [1] https://github.com/GoogleCloudPlatform/PerfKitBenchmarker
> [2] https://github.com/GoogleCloudPlatform/PerfKitBenchmarker/pull/1214
> [3] https://github.com/jasonkuster/PerfKitBenchmarker/tree/beam
>
> Looking forward to moving forward with this.
>
> Jason
>
> --
> -------
> Jason Kuster
> Apache Beam (Incubating) / Google Cloud Dataflow
