Re: [DISCUSSION] using NexMark for Beam

Dan Halperin Tue, 21 Mar 2017 09:42:08 -0700

Not a deep response, but this is awesome! We'd really like to have some
good benchmarks, and I'm excited you're updating Nexmark. This will be
great!


On Tue, Mar 21, 2017 at 9:38 AM, Etienne Chauchot <[email protected]>
wrote:

> Hi all,
>
> Ismael and I are working on upgrading the Nexmark implementation for Beam.
> See https://github.com/iemejia/beam/tree/BEAM-160-nexmark and
> https://issues.apache.org/jira/browse/BEAM-160. We are continuing the
> work done by Mark Shields. See https://github.com/apache/beam/pull/366
> for the original PR.
>
> The PR contains queries that cover a wide part of the Beam model and
> that represent realistic end-user use cases (some come from client
> experience on Google Cloud Dataflow).
>
> So far, we have upgraded the implementation to the latest Beam snapshot,
> and we are able to execute a good subset of the queries on the different
> runners. To do so, we upgraded the Nexmark drivers: the direct driver
> (upgraded from inProcessDriver) and the Flink driver, and we added a new
> one for Spark.
>
> There is still a good amount of work to do, and we would like to know
> whether you think this contribution could eventually have a place in Beam.
>
> The benefits of having Nexmark on Beam that we have seen so far are:
>
> - Rich batch/streaming test
>
> - A-B testing of runners or runtimes (non-regression, performance
> comparison between versions ...)
>
> - Integration testing (sdk/runners, runner/runtime, ...)
>
> - Validate the Beam capability matrix
>
> - It can be used as part of the ongoing PerfKit work (if there is any
> interest).
>
> As a final note, we are tracking the issues in the same repo. If anyone
> is interested in contributing, or has more ideas, you are welcome :)
>
> Etienne
>
>
