I fully support this. A smoothly running test infrastructure helps everybody’s work just flow better.
> The Jenkins Pull Request Builder is mostly functioning again.
> However, we are working on a simpler technical pipeline for testing
> patches, as this plug-in has been a constant source of downtime and
> issues for us, and is very hard to debug.

Yep. One such issue that happens too often is that Jenkins simply fails
to fetch from git. Hopefully a new pipeline will be able to fetch more
reliably.

> flaky tests

Dunno if these were some of the ones recently fixed, but the flakiest
tests seem to be the Kafka and Flume tests in Spark Streaming, based
purely on my subjective experience. It would be great if we could
stabilize them!

> Time of tests

PSA: Here are some related JIRA issues for those interested in working
on our testing setup:

- SPARK-3431: Parallelize execution of tests
  <https://issues.apache.org/jira/browse/SPARK-3431>
- SPARK-3432: Fix logging of unit test execution time
  <https://issues.apache.org/jira/browse/SPARK-3432>

Nick

On Sun, Sep 14, 2014 at 2:20 AM, Josh Rosen <rosenvi...@gmail.com> wrote:

> Also, huge thanks to Cheng Lian, who tracked down and fixed the final
> issue that was causing the Maven master build’s Spark SQL tests to fail!
>
> On September 13, 2014 at 11:08:00 PM, Patrick Wendell (pwend...@gmail.com)
> wrote:
> Hey All,
>
> Wanted to send a quick update about test infrastructure. With the
> number of contributors we have and the rate of development,
> maintaining a well-oiled test infra is really important.
>
> Every time a flaky test fails a legitimate pull request, it wastes
> developer time and effort.
>
> 1. Master build: Spark's master builds are back to green again in
> Maven and SBT after a long time of instability. Big thanks to Josh
> Rosen, Andrew Or, Nick Chammas, Shane Knapp, Sean Owen, and many
> others who were involved in pinpointing and fixing fairly convoluted
> test failure issues.
>
> 2. Jenkins PRB: The Jenkins Pull Request Builder is mostly functioning
> again. However, we are working on a simpler technical pipeline for
> testing patches, as this plug-in has been a constant source of
> downtime and issues for us, and is very hard to debug.
>
> 3. Reverting flaky patches: Going forward - we may revert patches that
> seem to be the root cause of flaky or failing tests. This is necessary
> as these days, the test infra being down will block something like
> 10-30 in-flight patches on a given day. This puts the onus back on the
> test writer to try and figure out what's going on - we'll of course
> help debug the issue!
>
> 4. Time of tests: With hundreds (thousands?) of tests, we will have a
> very high bar for tests which take several seconds or longer. Things
> like Thread.sleep() bloat test time when proper synchronization
> mechanisms should be used. Expect reviewers to push back on any
> long-running tests, in many cases they can be re-written to be both
> shorter and better.
>
> Thanks again to everyone putting in effort on this, we've made a ton
> of progress in the last few weeks. A solid test infra will help us
> scale and move quickly as Spark development continues to accelerate.
>
> - Patrick
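
On the Thread.sleep() point in item 4 of the quoted message: here is a minimal sketch of what "proper synchronization" could look like in a test. It is plain Java, not taken from any Spark test suite, and the class and method names (LatchInsteadOfSleep, runAsyncAndWait) are hypothetical. The idea is that the test blocks on a CountDownLatch that the background work counts down when it finishes, so the timeout is only an upper bound rather than a fixed delay paid on every run.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example class; not part of Spark.
class LatchInsteadOfSleep {

    // Starts some background work and waits for it to signal completion.
    // The worker body is a stand-in for whatever asynchronous code a test exercises.
    static int runAsyncAndWait(long timeoutMillis) {
        AtomicInteger result = new AtomicInteger(0);
        CountDownLatch done = new CountDownLatch(1);

        Thread worker = new Thread(() -> {
            result.set(42);   // the "work" being tested
            done.countDown(); // explicit completion signal
        });
        worker.start();

        try {
            // await() returns as soon as the latch reaches zero;
            // the timeout only matters if the worker never finishes.
            if (!done.await(timeoutMillis, TimeUnit.MILLISECONDS)) {
                throw new IllegalStateException(
                    "worker did not finish within " + timeoutMillis + " ms");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return result.get();
    }

    public static void main(String[] args) {
        System.out.println(runAsyncAndWait(5000));
    }
}
```

With a fixed Thread.sleep(5000) the test would always take five seconds and could still fail on a slow machine; with the latch it normally returns almost immediately and fails loudly only if the timeout is actually exceeded.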