The PR builder seems to be building against Hadoop 2.3. In the log for the most recent successful build ( https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/32805/consoleFull ) I see:
=========================================================================
Building Spark
=========================================================================
[info] Compile with Hive 0.13.1
[info] Building Spark with these arguments: -Pyarn -Phadoop-2.3 -Dhadoop.version=2.3.0 -Pkinesis-asl -Phive -Phive-thriftserver
...
=========================================================================
Running Spark unit tests
=========================================================================
[info] Running Spark tests with these arguments: -Pyarn -Phadoop-2.3 -Dhadoop.version=2.3.0 -Pkinesis-asl test

Is anyone testing individual pull requests against Hadoop 2.4 or 2.6 before the code is declared "clean"?

Fred

From: Ted Yu <yuzhih...@gmail.com>
To: Andrew Or <and...@databricks.com>
Cc: "dev@spark.apache.org" <dev@spark.apache.org>
Date: 05/15/2015 09:29 AM
Subject: Re: Recent Spark test failures

Jenkins build against hadoop 2.4 has been unstable recently:
https://amplab.cs.berkeley.edu/jenkins/view/Spark/job/Spark-Master-Maven-with-YARN/HADOOP_PROFILE=hadoop-2.4,label=centos/

I haven't found the test which hung / failed in recent Jenkins builds.

But PR builder has several green builds lately:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/

Maybe PR builder doesn't build against hadoop 2.4 ?

Cheers

On Mon, May 11, 2015 at 1:11 PM, Ted Yu <yuzhih...@gmail.com> wrote:

Makes sense.
Having high determinism in these tests would make Jenkins build stable.

On Mon, May 11, 2015 at 1:08 PM, Andrew Or <and...@databricks.com> wrote:

Hi Ted,

Yes, those two options can be useful, but in general I think the standard to set is that tests should never fail. It's actually the worst if tests fail sometimes but not others, because we can't reproduce them deterministically. Using -M and -A actually tolerates flaky tests to a certain extent, and I would prefer to instead increase the determinism in these tests.

-Andrew

2015-05-08 17:56 GMT-07:00 Ted Yu <yuzhih...@gmail.com>:

Andrew:
Do you think the -M and -A options described here can be used in test runs ?
http://scalatest.org/user_guide/using_the_runner

Cheers

On Wed, May 6, 2015 at 5:41 PM, Andrew Or <and...@databricks.com> wrote:

Dear all,

I'm sure you have all noticed that the Spark tests have been fairly unstable recently. I wanted to share a tool that I use to track which tests have been failing most often in order to prioritize fixing these flaky tests.

Here is an output of the tool. This spreadsheet reports the top 10 failed tests this week (ending yesterday 5/5):
https://docs.google.com/spreadsheets/d/1Iv_UDaTFGTMad1sOQ_s4ddWr6KD3PuFIHmTSzL7LSb4

It is produced by a small project:
https://github.com/andrewor14/spark-test-failures

I have been filing JIRAs on flaky tests based on this tool. Hopefully we can collectively stabilize the build a little more as we near the release for Spark 1.4.

-Andrew
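
For readers who want a feel for the "top 10 failed tests" report mentioned above, here is a minimal Python sketch of that kind of tally. It is not the spark-test-failures implementation, which is not shown in this thread. The function name, the (build_id, test_name) record shape, and the sample build numbers and test names are all made up for illustration.

```python
from collections import Counter


def top_failing_tests(failures, n=10):
    """Return the n most frequently failing tests as (name, count) pairs.

    `failures` is an iterable of (build_id, test_name) records. This record
    shape is an assumption for the sketch, not the format of any real tool.
    """
    # Count a test at most once per build, so one bad build that reports
    # the same failure several times does not dominate the ranking.
    unique_failures = set(failures)
    counts = Counter(test for _build, test in unique_failures)
    # Sort by count (descending), then by name, so ties come out in a
    # deterministic order.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


# Hypothetical sample data: build numbers and test names are invented.
sample = [
    (1001, "ExampleSuite.flaky test a"),
    (1002, "ExampleSuite.flaky test a"),
    (1002, "OtherSuite.flaky test b"),
    (1002, "OtherSuite.flaky test b"),  # duplicate report within one build
]

for name, count in top_failing_tests(sample, n=10):
    print(f"{count:3d}  {name}")
```

Counting per build rather than per report is a design choice in this sketch. It ranks tests by how many builds they broke, which is probably the more useful number when deciding which flaky test to fix first.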