Hi all,
As discussed here, I proposed a PR (https://github.com/apache/beam/pull/4976) to
schedule Nexmark runs as post-commit tests. The post-commits run:
* Nexmark on the direct runner in batch mode
* Nexmark on the direct runner in streaming mode
* Nexmark on the Flink runner in batch mode
* Nexmark on the Flink runner in streaming mode
* Nexmark on the Spark runner in batch mode
These are the runners/modes for which all the Nexmark queries currently run fine.
There is still an output sink (e.g. a database) to be added to Nexmark; it just
writes to the console for now (see the sketch below).
If this is too costly, we could schedule fewer runs.
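For illustration, here is a minimal, untested sketch of what such a sink could
look like using BigQueryIO from the Java SDK. The class name, the table, and the
result fields are made up for the example, and the table is assumed to already
exist with a matching schema:

import com.google.api.services.bigquery.model.TableRow;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.apache.beam.sdk.io.gcp.bigquery.TableRowJsonCoder;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Create;

public class NexmarkPerfToBigQuery {
  public static void main(String[] args) {
    Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());
    // One row per benchmark run; fields and values are hypothetical for this sketch.
    TableRow result = new TableRow()
        .set("query", "QUERY5")
        .set("runtimeSec", 12.3)
        .set("eventsPerSec", 81000.0);
    p.apply(Create.of(result).withCoder(TableRowJsonCoder.of()))
        .apply(BigQueryIO.writeTableRows()
            // Hypothetical table, assumed to already exist.
            .to("my-project:nexmark.perf_results")
            .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_NEVER)
            .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND));
    p.run().waitUntilFinish();
  }
}

The scheduled job could then chart these rows over time to spot regressions.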
Etienne
>
> So what next? Shall we schedule Nexmark runs and add a BigQuery sink to
> Nexmark output?
>
> On Monday, March 12, 2018 at 10:30 +0100, Etienne Chauchot wrote:
> > Thanks everyone for your comments and support.
> >
> > On Friday, March 9, 2018 at 21:28 +0000, Alan Myrvold wrote:
> > > Great ideas. I want to see a daily signal for anything that could prevent
> > > a release from happening, and precommits
> > > that are fast and reliable for areas that are commonly broken by code
> > > changes.
> > >
> > > We are now running the Java quickstarts daily on a cron schedule, using
> > > direct, Dataflow, and local Spark and Flink in the
> > > beam_PostRelease_NightlySnapshot job; see
> > > https://github.com/apache/beam/blob/master/release/build.gradle.
> > > This should provide a good signal for the examples integration tests
> > > against these runners.
> > >
> > > As Kenn noted, the java_maveninstall job also runs lots of tests. It would
> > > be good to be clearer and more intentional about which tests run when, and
> > > to consider implementing additional "always up" environments for use by
> > > the tests.
> > >
> > > Having the Nexmark smoke tests run regularly and their results stored in a
> > > database would really enhance our efforts, perhaps starting with the
> > > DirectRunner for the performance tests.
> > Yes
> >
> > >
> > > What area would have the most immediate impact? Nexmark smoke tests?
> > Yes, IMHO Nexmark smoke tests would have a great return on investment. By
> > scheduling just some of them (at first), we build deep confidence in the
> > runners on real user pipelines. In the past, Nexmark has allowed us to
> > discover performance regressions before a release and also to discover some
> > bugs in some runners. But please note that, for this last ability, Nexmark
> > is currently limited: it only detects failures if an exception is thrown.
> > There is no check of the correctness of the output PCollection, because the
> > aim was performance testing and there is no point adding a slow correctness
> > test. Nevertheless, if we store the output size (as I suggested in this
> > thread), we can get a hint of a failure when the output size differs from
> > the previously stored output sizes; a rough sketch of such a check is below.
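> > For instance, something as simple as the following could flag it (all names
> > are made up for this sketch; the real check would read the sizes from
> > wherever we store the results):
> >
> > import java.util.ArrayList;
> > import java.util.Collections;
> > import java.util.List;
> > import java.util.Map;
> >
> > // Hypothetical helper: flags a run whose output size differs from the sizes
> > // recorded for previous runs of the same query, then records the new size.
> > class OutputSizeCheck {
> >   static boolean looksSuspicious(String query, long outputSize,
> >       Map<String, List<Long>> history) {
> >     List<Long> previous = history.getOrDefault(query, Collections.emptyList());
> >     boolean suspicious = !previous.isEmpty() && !previous.contains(outputSize);
> >     history.computeIfAbsent(query, k -> new ArrayList<>()).add(outputSize);
> >     return suspicious;
> >   }
> > }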
> >
> > Etienne
> >
> > >
> > >
> > >
> > >
> > > On Fri, Mar 9, 2018 at 12:57 PM Kenneth Knowles <[email protected]> wrote:
> > > > On Fri, Mar 9, 2018 at 3:08 AM Etienne Chauchot <[email protected]>
> > > > wrote:
> > > > > Hi guys,
> > > > >
> > > > > I was looking at the various Jenkins jobs and I wanted to make a
> > > > > proposal:
> > > > >
> > > > > - ValidatesRunner tests: currently run at PostCommit for all the
> > > > > runners. I think this is the quickest way to see regressions, so keep
> > > > > it that way.
> > > > We've also toyed with precommit for runners where it is fast.
> > > >
> > > > > - Integration tests: AFAIK we only run the ones in the examples
> > > > > module, and only on demand. What about running all the ITs (in
> > > > > particular the IO ITs) as a daily cron job with the direct runner?
> > > > > Please note that this will require some always-up backend
> > > > > infrastructure.
> > > > I like this idea. We actually run more, but in postcommit. You can see
> > > > the goal here:
> > > > https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_beam_PostCommit_Java_MavenInstall.groovy#L47
> > > >
> > > > There's no infrastructure set up that I see. It is only DirectRunner
> > > > and DataflowRunner currently, as they are
> > > > "always up". But so could be local Flink and Spark. Do the ITs spin up
> > > > local versions of what they are
> > > > connecting to?
> > > >
> > > > If we have adequate resources, I also think ValidatesRunner on a real
> > > > cluster would add value, once we have cluster set-up/tear-down automation
> > > > or an "always up" cluster.
> > > >
> > > > > - Performance tests: what about running the Nexmark SMOKE test suite
> > > > > in batch and streaming modes with all the runners on a daily basis and
> > > > > storing the running times in an RRD database (to spot performance
> > > > > regressions)?
> > > > I like this idea, too. I think we could do DirectRunner (and probably
> > > > local Flink) as postcommit without being
> > > > too expensive.
> > > >
> > > > Kenn
> > > >
> > > >
> > > > > Please note that not all the queries run on all the runners in all the
> > > > > modes right now. Also, we have some streaming pipeline termination
> > > > > issues (see https://issues.apache.org/jira/browse/BEAM-2847).
> > > > >
> > > > > I know that Stephen Sisk used to work on these topics. I also talked
> > > > > to the folks from Polidea, but as I understood, they mainly run
> > > > > integration tests on the Dataflow runner.
> > > > >
> > > > > WDYT?
> > > > >
> > > > > Etienne
> > > > >
> > > > >
> > > > >