It's being worked on. Turns out there are some modifications still needed
to the NexMark queries.

Reuven

On Thu, Sep 14, 2017 at 9:33 PM, Pei HE <[email protected]> wrote:

> Could any Googlers help to run NexMark on Dataflow streaming and share the
> numbers with the community?
> --
> Pei
>
> On Fri, Aug 25, 2017 at 11:28 PM, Lukasz Cwik <[email protected]>
> wrote:
>
> > Etienne, cut some JIRAs for improvements like ValidatesRunner for the
> > Nexmark suite that you think are worthy. Some of them might be good
> > 'starter' tasks as well.
> >
> > On Fri, Aug 25, 2017 at 1:43 AM, Etienne Chauchot <[email protected]>
> > wrote:
> >
> > > Hi guys,
> > >
> > > There are also some points to discuss:
> > >
> > > - I think some of the tests in this test suite should be generalized as
> > > ValidatesRunner tests, as was done, for example, for custom window
> > > merging:
> > > https://github.com/apache/beam/blob/5181e619f17e1f69fabe8d5bdfc7a3a6a2142cde/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/windowing/WindowTest.java#L591
> > >
> > > - We have run almost no tests on Dataflow, so if someone could run the
> > > test suite on Dataflow, that would be very welcome. All the needed
> > > information is still in the README, but I'll move it to the website.
> > >
> > > - other points?
> > >
> > > WDYT?
> > >
> > > Best,
> > >
> > > Etienne
> > >
> > >
> > >
> > > On 24/08/2017 at 18:35, Lukasz Cwik wrote:
> > >
> > >> Yeah, was looking forward to this.
> > >>
> > >> On Thu, Aug 24, 2017 at 9:20 AM, Tyler Akidau <[email protected]>
> > >> wrote:
> > >>
> > >>> Awesome news, thank you! :-D
> > >>>
> > >>> On Thu, Aug 24, 2017 at 12:40 AM Etienne Chauchot <[email protected]>
> > >>> wrote:
> > >>>
> > >>>> Hi all,
> > >>>>
> > >>>> I wanted to let you know that the Nexmark PR is merged into master.
> > >>>> Feel free to use it (e.g. performance testing, release testing ...).
> > >>>>
> > >>>> Etienne
> > >>>>
> > >>>> On 12/05/2017 at 10:55, Etienne Chauchot wrote:
> > >>>>
> > >>>>> Hi guys,
> > >>>>>
> > >>>>> I wanted to let you know that I have just submitted a PR around
> > >>>>> NexMark. This is a port of the NexMark queries to Beam, to be used
> > >>>>> as integration tests.
> > >>>>> It can also be used for A-B testing (non-regression or performance
> > >>>>> comparison between 2 versions of the same engine or of the same
> > >>>>> runner).
> > >>>>>
> > >>>>> This is a continuation of the previous PR (#99) from Mark Shields.
> > >>>>> The code has changed quite a bit: some queries have changed to use
> > >>>>> new Beam APIs and there were some big refactorings. More
> > >>>>> importantly, we can now run all the queries in all the runners.
> > >>>>>
> > >>>>> Nevertheless, there are still some open issues in Nexmark
> > >>>>> (https://github.com/iemejia/beam/issues) and in Beam upstream (see
> > >>>>> issue links in https://issues.apache.org/jira/browse/BEAM-160)
> > >>>>>
> > >>>>> I wanted to submit the PR before our (Ismaël and I) NexMark talk at
> > >>>>> ApacheCon. The PR is not perfect, but it is in good enough shape to
> > >>>>> share.
> > >>>>>
> > >>>>> Best,
> > >>>>>
> > >>>>> Etienne
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>> On 22/03/2017 at 04:51, Kenneth Knowles wrote:
> > >>>>>
> > >>>>>> This is great! Having a variety of realistic-ish pipelines running
> > >>>>>> on all runners complements the validation suite and IO IT work.
> > >>>>>>
> > >>>>>> If I recall, some of these involve heavy and esoteric uses of
> > >>>>>> state, so definitely give me a ping if you hit any trouble.
> > >>>>>>
> > >>>>>> Kenn
> > >>>>>>
> > >>>>>> On Tue, Mar 21, 2017 at 9:38 AM, Etienne Chauchot <[email protected]>
> > >>>>>> wrote:
> > >>>>>>
> > >>>>>>> Hi all,
> > >>>>>>>
> > >>>>>>> Ismael and I are working on upgrading the Nexmark implementation
> > >>>>>>> for Beam.
> > >>>>>>> See https://github.com/iemejia/beam/tree/BEAM-160-nexmark and
> > >>>>>>> https://issues.apache.org/jira/browse/BEAM-160. We are continuing
> > >>>>>>> the work done by Mark Shields. See
> > >>>>>>> https://github.com/apache/beam/pull/366 for the original PR.
> > >>>>>>>
> > >>>>>>> The PR contains queries that cover a wide range of the Beam
> > >>>>>>> model and that represent realistic end-user use cases (some come
> > >>>>>>> from client experience on Google Cloud Dataflow).
> > >>>>>>>
> > >>>>>>> So far, we have upgraded the implementation to the latest Beam
> > >>>>>>> snapshot, and we are able to execute a good subset of the queries
> > >>>>>>> in the different runners. To do so, we upgraded the Nexmark
> > >>>>>>> drivers: the direct driver (upgraded from inProcessDriver) and the
> > >>>>>>> Flink driver, and we added a new one for Spark.
> > >>>>>>>
> > >>>>>>> There is still a good amount of work to do, and we would like to
> > >>>>>>> know if you think that this contribution can eventually have its
> > >>>>>>> place in Beam.
> > >>>>>>>
> > >>>>>>> The benefits of having Nexmark on Beam that we have seen so far
> > >>>>>>> are:
> > >>>>>>>
> > >>>>>>> - Rich batch/streaming test
> > >>>>>>>
> > >>>>>>> - A-B testing of runners or runtimes (non-regression, performance
> > >>>>>>> comparison between versions ...)
> > >>>>>>>
> > >>>>>>> - Integration testing (sdk/runners, runner/runtime, ...)
> > >>>>>>>
> > >>>>>>> - Validate beam capability matrix
> > >>>>>>>
> > >>>>>>> - It can be used as part of the ongoing PerfKit work (if there is
> > >>>>>>> any interest).
> > >>>>>>>
> > >>>>>>> As a final note, we are tracking the issues in the same repo. If
> > >>>>>>> someone is interested in contributing, or has more ideas, you are
> > >>>>>>> welcome :)
> > >>>>>>>
> > >>>>>>> Etienne
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>
> > >
> >
>
