Hi all,
As discussed here, I proposed a PR (https://github.com/apache/beam/pull/4976) to
schedule Nexmark runs as post-commit tests. The post-commits run:
* Nexmark on the direct runner in batch mode
* Nexmark on the direct runner in streaming mode
* Nexmark on the Flink runner in batch mode
* Nexmark on the Flink runner in streaming mode
* Nexmark on the Spark runner in batch mode
These are the runners/modes for which all the Nexmark queries currently run fine.
There is still an output sink (e.g. a database) to be added to Nexmark; it just
writes to the console for now (see the sketch below).
If this is too costly, we could schedule fewer runs.
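For illustration, here is a minimal, untested sketch of what such a sink could
look like using BigQueryIO from the Java SDK. The class name, the table, and the
result fields are made up for the example, and the table is assumed to already
exist with a matching schema:

import com.google.api.services.bigquery.model.TableRow;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.apache.beam.sdk.io.gcp.bigquery.TableRowJsonCoder;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Create;

public class NexmarkPerfToBigQuery {
  public static void main(String[] args) {
    Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());
    // One row per benchmark run; fields and values are hypothetical for this sketch.
    TableRow result = new TableRow()
        .set("query", "QUERY5")
        .set("runtimeSec", 12.3)
        .set("eventsPerSec", 81000.0);
    p.apply(Create.of(result).withCoder(TableRowJsonCoder.of()))
        .apply(BigQueryIO.writeTableRows()
            // Hypothetical table, assumed to already exist.
            .to("my-project:nexmark.perf_results")
            .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_NEVER)
            .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND));
    p.run().waitUntilFinish();
  }
}

The scheduled job could then chart these rows over time to spot regressions.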
Etienne
>
> So what next? Shall we schedule Nexmark runs and add a BigQuery sink to
> Nexmark output?
>
> On Monday, March 12, 2018 at 10:30 +0100, Etienne Chauchot wrote:
> > Thanks everyone for your comments and support.
> >
> > On Friday, March 9, 2018 at 21:28 +0000, Alan Myrvold wrote:
> > > Great ideas. I want to see a daily signal for anything that could prevent
> > > a release from happening, and precommits
> > > that are fast and reliable for areas that are commonly broken by code
> > > changes.
> > >
> > > We are now running the Java quickstarts daily on a cron schedule, using
> > > direct, Dataflow, and local Spark and Flink in the
> > > beam_PostRelease_NightlySnapshot job; see
> > > https://github.com/apache/beam/blob/master/release/build.gradle.
> > > This should provide a good signal for the examples integration tests
> > > against these runners.
> > >
> > > As Kenn noted, the java_maveninstall job also runs lots of tests. It would
> > > be good to be clearer and more intentional about which tests run when, and
> > > to consider implementing additional "always up" environments for use by
> > > the tests.
> > >
> > > Having the Nexmark smoke tests run regularly and their results stored in a
> > > database would really enhance our efforts, perhaps starting with the
> > > DirectRunner for the performance tests.
> > Yes
> >
> > >
> > > What area would have the most immediate impact? Nexmark smoke tests?
> > Yes, IMHO Nexmark smoke tests would have a great return on investment. By
> > scheduling just some of them (at first), we build deep confidence in the
> > runners on real user pipelines. In the past, Nexmark has allowed us to
> > discover performance regressions before a release and also to discover some
> > bugs in some runners. But please note that, for this last ability, Nexmark
> > is currently limited: it only detects failures if an exception is thrown.
> > There is no check of the correctness of the output PCollection, because the
> > aim was performance testing and there is no point adding a slow correctness
> > test. Nevertheless, if we store the output size (as I suggested in this
> > thread), we can get a hint of a failure when the output size differs from
> > the previously stored output sizes; a rough sketch of such a check is below.
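> > For instance, something as simple as the following could flag it (all names
> > are made up for this sketch; the real check would read the sizes from
> > wherever we store the results):
> >
> > import java.util.ArrayList;
> > import java.util.Collections;
> > import java.util.List;
> > import java.util.Map;
> >
> > // Hypothetical helper: flags a run whose output size differs from the sizes
> > // recorded for previous runs of the same query, then records the new size.
> > class OutputSizeCheck {
> >   static boolean looksSuspicious(String query, long outputSize,
> >       Map<String, List<Long>> history) {
> >     List<Long> previous = history.getOrDefault(query, Collections.emptyList());
> >     boolean suspicious = !previous.isEmpty() && !previous.contains(outputSize);
> >     history.computeIfAbsent(query, k -> new ArrayList<>()).add(outputSize);
> >     return suspicious;
> >   }
> > }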
> >
> > Etienne
> >
> > >
> > >
> > >
> > >
> > > On Fri, Mar 9, 2018 at 12:57 PM Kenneth Knowles <[email protected]> wrote:
> > > > On Fri, Mar 9, 2018 at 3:08 AM Etienne Chauchot <[email protected]>
> > > > wrote:
> > > > > Hi guys,
> > > > >
> > > > > I was looking at the various Jenkins jobs and I wanted to make a
> > > > > proposal:
> > > > >
> > > > > - ValidatesRunner tests: currently run at PostCommit for all the
> > > > > runners. I think this is the quickest way to see regressions, so keep
> > > > > it that way.
> > > > We've also toyed with precommit for runners where it is fast.
> > > >
> > > > > - Integration tests: AFAIK we only run the ones in the examples
> > > > > module, and only on demand. What about running all the ITs (in
> > > > > particular the IO ITs) as a daily cron job with the direct runner?
> > > > > Please note that this will require some always-up backend
> > > > > infrastructure.
> > > > I like this idea. We actually run more, but in postcommit. You can see
> > > > the goal here:
> > > > https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_beam_PostCommit_Java_MavenInstall.groovy#L47
> > > >
> > > > There's no infrastructure set up that I see. It is only DirectRunner
> > > > and DataflowRunner currently, as they are
> > > > "always up". But so could be local Flink and Spark. Do the ITs spin up
> > > > local versions of what they are
> > > > connecting to?
> > > >
> > > > If we have adequate resources, I also think ValidatesRunner on a real
> > > > cluster would add value, once we have cluster set-up/tear-down automation
> > > > or an "always up" cluster.
> > > >
> > > > > - Performance tests: what about running the Nexmark SMOKE test suite
> > > > > in batch and streaming modes with all the runners on a daily basis and
> > > > > storing the running times in an RRD database (to spot performance
> > > > > regressions)?
> > > > I like this idea, too. I think we could do DirectRunner (and probably
> > > > local Flink) as postcommit without being
> > > > too expensive.
> > > >
> > > > Kenn
> > > >
> > > >
> > > > > Please note that not all the queries run on all the runners in all the
> > > > > modes right now. Also, we have some streaming pipeline termination
> > > > > issues (see https://issues.apache.org/jira/browse/BEAM-2847).
> > > > >
> > > > > I know that Stephen Sisk used to work on these topics. I also talked
> > > > > to the folks from Polidea, but as I understood, they mainly run
> > > > > integration tests on the Dataflow runner.
> > > > >
> > > > > WDYT?
> > > > >
> > > > > Etienne
> > > > >
> > > > >
> > > > >