So 2 hours is a hard cap on how long a build can run. Okie doke. Perhaps then I'll wrap the run-tests step as you suggest and limit it to 100 minutes or something, and cleanly report if it times out.
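Roughly, a minimal sketch of that wrapper, assuming GNU coreutils `timeout` is available on the Jenkins workers. The helper name `run_with_timeout`, the messages, and the `100m` value are illustrative only and are not what dev/run-tests-jenkins does today:

```shell
#!/usr/bin/env bash
# Sketch only: wrap a long-running step in GNU coreutils `timeout` so the
# calling script can report a timeout itself, before Jenkins' 120-minute
# hard kill takes the whole build down.

# Hypothetical helper; not part of the current dev/run-tests-jenkins.
run_with_timeout () {
  local limit="$1"
  shift
  local status=0
  # `|| status=$?` keeps the exit code without tripping `set -e`.
  timeout "$limit" "$@" || status=$?
  if [ "$status" -eq 124 ]; then
    # GNU timeout exits with 124 when the command ran past the limit.
    echo "Tests timed out after ${limit}." >&2
  elif [ "$status" -ne 0 ]; then
    echo "Tests failed with exit code ${status}." >&2
  fi
  return "$status"
}

# Intended use (assumed limit, leaving headroom under the 120-minute cap):
#   run_with_timeout 100m ./dev/run-tests
```

Because the helper returns the wrapped command's exit code, the caller can still tell a timeout (124) apart from an ordinary test failure and word the posted message accordingly.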
Sound good?

On Fri, Aug 15, 2014 at 4:43 PM, Patrick Wendell <pwend...@gmail.com> wrote:

> Hey Nicholas,
>
> Yeah so Jenkins has its own timeout mechanism and it will just kill the
> entire build after 120 minutes. But since run-tests is sitting in the
> middle of the tests, it can't actually post a failure message.
>
> I think run-tests-jenkins should just wrap the call to run-tests in its
> own timeout. It might be possible to just use this:
>
> http://linux.die.net/man/1/timeout
>
> - Patrick
>
>
> On Fri, Aug 15, 2014 at 1:31 PM, Nicholas Chammas <nicholas.cham...@gmail.com> wrote:
>
>> OK, I've captured this in SPARK-3076
>> <https://issues.apache.org/jira/browse/SPARK-3076>.
>>
>> Patrick,
>>
>> Is the problem that this run-tests
>> <https://github.com/apache/spark/blob/0afe5cb65a195d2f14e8dfcefdbec5dac023651f/dev/run-tests-jenkins#L151>
>> step times out, and that is currently not handled gracefully? To be more
>> specific, it hangs for 120 minutes, times out, but the parent script for
>> some reason is also terminated. Does that sound right?
>>
>> Nick
>>
>>
>> On Fri, Aug 15, 2014 at 3:33 PM, Shivaram Venkataraman <shiva...@eecs.berkeley.edu> wrote:
>>
>>> Jenkins runs for this PR https://github.com/apache/spark/pull/1960
>>> timed out without notification. The relevant Jenkins logs are at
>>>
>>> https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18588/consoleFull
>>>
>>> https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18592/consoleFull
>>>
>>> https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18597/consoleFull
>>>
>>>
>>> On Fri, Aug 15, 2014 at 11:44 AM, Nicholas Chammas <nicholas.cham...@gmail.com> wrote:
>>>
>>>> Shivaram,
>>>>
>>>> Can you point us to an example of that happening? The Jenkins console
>>>> output, that is.
>>>>
>>>> Nick
>>>>
>>>>
>>>> On Fri, Aug 15, 2014 at 2:28 PM, Shivaram Venkataraman <shiva...@eecs.berkeley.edu> wrote:
>>>>
>>>>> Also I think Jenkins doesn't post build timeouts to GitHub. Is there
>>>>> any way we can fix that?
>>>>>
>>>>> On Aug 15, 2014 9:04 AM, "Patrick Wendell" <pwend...@gmail.com> wrote:
>>>>>
>>>>> > Hi All,
>>>>> >
>>>>> > I noticed that all PR tests run overnight had failed due to
>>>>> > timeouts. The patch that updates the netty shuffle I believe
>>>>> > somehow inflated the build time significantly. That patch had been
>>>>> > tested, but one change was made before it was merged that was not
>>>>> > tested.
>>>>> >
>>>>> > I've reverted the patch for now to see if it brings the build times
>>>>> > back down.
>>>>> >
>>>>> > - Patrick