It's being worked on. Turns out there are some modifications still needed to the NexMark queries.

Reuven

On Thu, Sep 14, 2017 at 9:33 PM, Pei HE <[email protected]> wrote:

> Could any Googlers help to run NexMark on Dataflow streaming and share the
> numbers with the community?
> --
> Pei
>
> On Fri, Aug 25, 2017 at 11:28 PM, Lukasz Cwik <[email protected]> wrote:
>
> > Etienne, cut some JIRAs for improvements like ValidatesRunner for the
> > Nexmark suite that you think are worthy. Some of them might be good
> > 'starter' tasks as well.
> >
> > On Fri, Aug 25, 2017 at 1:43 AM, Etienne Chauchot <[email protected]> wrote:
> >
> > > Hi guys,
> > >
> > > There are also some points to discuss:
> > >
> > > - I think some of the tests in this test suite should be generalized as
> > >   validatesRunner tests, as was done for example for custom window
> > >   merging
> > >   (https://github.com/apache/beam/blob/5181e619f17e1f69fabe8d5bdfc7a3a6a2142cde/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/windowing/WindowTest.java#L591)
> > >
> > > - We have run almost no tests on Dataflow, so if someone could run the
> > >   test suite on Dataflow, they are very welcome. All the needed
> > >   information is still in the README, but I'll move it to the website.
> > >
> > > - Other points?
> > >
> > > WDYT?
> > >
> > > Best,
> > >
> > > Etienne
> > >
> > > On 24/08/2017 at 18:35, Lukasz Cwik wrote:
> > >
> > >> Yeah, was looking forward to this.
> > >>
> > >> On Thu, Aug 24, 2017 at 9:20 AM, Tyler Akidau <[email protected]> wrote:
> > >>
> > >>> Awesome news, thank you! :-D
> > >>>
> > >>> On Thu, Aug 24, 2017 at 12:40 AM Etienne Chauchot <[email protected]> wrote:
> > >>>
> > >>>> Hi all,
> > >>>>
> > >>>> I wanted to let you know that the Nexmark PR is merged into master.
> > >>>> Feel free to use it (e.g. performance testing, release testing ...).
> > >>>>
> > >>>> Etienne
> > >>>>
> > >>>> On 12/05/2017 at 10:55, Etienne Chauchot wrote:
> > >>>>
> > >>>>> Hi guys,
> > >>>>>
> > >>>>> I wanted to let you know that I have just submitted a PR around
> > >>>>> NexMark. This is a port of the NexMark queries to Beam, to be used
> > >>>>> as integration tests.
> > >>>>> This can also be used for A-B testing (no-regression or performance
> > >>>>> comparison between 2 versions of the same engine or of the same
> > >>>>> runner).
> > >>>>>
> > >>>>> This is a continuation of the previous PR (#99) from Mark Shields.
> > >>>>> The code has changed quite a bit: some queries have changed to use
> > >>>>> new Beam APIs and there were some big refactorings. More
> > >>>>> importantly, we can now run all the queries on all the runners.
> > >>>>>
> > >>>>> Nevertheless, there are still some open issues in Nexmark
> > >>>>> (https://github.com/iemejia/beam/issues) and in Beam upstream (see
> > >>>>> issue links in https://issues.apache.org/jira/browse/BEAM-160).
> > >>>>>
> > >>>>> I wanted to submit the PR before our (Ismaël and I) NexMark talk at
> > >>>>> ApacheCon. The PR is not perfect but it is in good enough shape to
> > >>>>> share.
> > >>>>>
> > >>>>> Best,
> > >>>>>
> > >>>>> Etienne
> > >>>>>
> > >>>>> On 22/03/2017 at 04:51, Kenneth Knowles wrote:
> > >>>>>
> > >>>>>> This is great! Having a variety of realistic-ish pipelines running
> > >>>>>> on all runners complements the validation suite and IO IT work.
> > >>>>>>
> > >>>>>> If I recall, some of these involve heavy and esoteric uses of
> > >>>>>> state, so definitely give me a ping if you hit any trouble.
> > >>>>>>
> > >>>>>> Kenn
> > >>>>>>
> > >>>>>> On Tue, Mar 21, 2017 at 9:38 AM, Etienne Chauchot <[email protected]> wrote:
> > >>>>>>
> > >>>>>>> Hi all,
> > >>>>>>>
> > >>>>>>> Ismael and I are working on upgrading the Nexmark implementation
> > >>>>>>> for Beam.
> > >>>>>>> See https://github.com/iemejia/beam/tree/BEAM-160-nexmark and
> > >>>>>>> https://issues.apache.org/jira/browse/BEAM-160. We are continuing
> > >>>>>>> the work done by Mark Shields. See
> > >>>>>>> https://github.com/apache/beam/pull/366 for the original PR.
> > >>>>>>>
> > >>>>>>> The PR contains queries that have a wide coverage of the Beam
> > >>>>>>> model and that represent realistic end-user use cases (some come
> > >>>>>>> from client experience on Google Cloud Dataflow).
> > >>>>>>>
> > >>>>>>> So far, we have upgraded the implementation to the latest Beam
> > >>>>>>> snapshot, and we are able to execute a good subset of the queries
> > >>>>>>> on the different runners. We upgraded the Nexmark drivers to do
> > >>>>>>> so: the direct driver (upgraded from inProcessDriver) and the
> > >>>>>>> Flink driver, and we added a new one for Spark.
> > >>>>>>>
> > >>>>>>> There is still a good amount of work to do, and we would like to
> > >>>>>>> know if you think that this contribution can eventually have its
> > >>>>>>> place in Beam.
> > >>>>>>>
> > >>>>>>> The benefits of having Nexmark on Beam that we have seen so far
> > >>>>>>> are:
> > >>>>>>>
> > >>>>>>> - Rich batch/streaming test
> > >>>>>>>
> > >>>>>>> - A-B testing of runners or runtimes (non-regression, performance
> > >>>>>>>   comparison between versions ...)
> > >>>>>>>
> > >>>>>>> - Integration testing (sdk/runners, runner/runtime, ...)
> > >>>>>>>
> > >>>>>>> - Validate the Beam capability matrix
> > >>>>>>>
> > >>>>>>> - It can be used as part of the ongoing PerfKit work (if there is
> > >>>>>>>   any interest).
> > >>>>>>>
> > >>>>>>> As a final note, we are tracking the issues in the same repo. If
> > >>>>>>> someone is interested in contributing, or has more ideas, you are
> > >>>>>>> welcome :)
> > >>>>>>>
> > >>>>>>> Etienne
