cool. FYI, i'm at databricks today and talked w/patrick, josh and davies about this. we have some great ideas to actually make this happen and will be pushing over the next few weeks to get it done. :)
On Thu, Apr 2, 2015 at 9:21 AM, Nicholas Chammas <nicholas.cham...@gmail.com> wrote:

> (Renaming thread so as to un-hijack Marcelo's request.)
>
> Sure, we definitely want tests running faster.
>
> Part of "testing all the things" will be factoring out stuff from the
> various builds that can be run just once.
>
> We've also tried in the past (with little success) to parallelize test
> execution <https://issues.apache.org/jira/browse/SPARK-3431>. That still
> needs work before it becomes possible.
>
> Nick
>
> On Thu, Apr 2, 2015 at 11:59 AM shane knapp <skn...@berkeley.edu> wrote:
>
>> i agree with all of this. but can we please break up the tests and make
>> them shorter? :)
>>
>> On Thu, Apr 2, 2015 at 8:54 AM, Nicholas Chammas <nicholas.cham...@gmail.com> wrote:
>>
>>> This is secondary to Marcelo's question, but I wanted to comment on this:
>>>
>>> Its main limitation is more cultural than technical: you need to get
>>> people to care about intermittent test runs, otherwise you can end up
>>> with failures that nobody keeps on top of
>>>
>>> This is a problem that plagues Spark as well, but there *is* a technical
>>> solution.
>>>
>>> The solution is simple: *All* the builds that we care about run for
>>> *every* proposed change. If *any* build fails, the change doesn't make
>>> it into the repository.
>>>
>>> Spark already has a pull request builder that tests and reports back on
>>> PRs. Committers don't merge in PRs when this builder reports that it
>>> failed some tests. That's a good thing.
>>>
>>> The problem is that there are several other builds that we run on a
>>> fixed interval, independent of the pull request builder. These builds
>>> test different configurations, dependency versions, and environments
>>> than what the PR builder covers. If one of those builds fails, it fails
>>> on its own little island, with no-one to hear it scream. The build
>>> failure is detached from the PR that caused it to fail.
>>> What should happen is that the whole matrix of stuff we care to test
>>> gets run for every PR. No PR goes in if any build we care about fails
>>> for that PR, and every build we care about runs for every commit of
>>> every PR.
>>>
>>> Really, this is just an extension of the basic idea of the PR builder.
>>> It doesn't make much sense to test stuff *after* it has been committed
>>> and potentially broken things. And it becomes exponentially more
>>> difficult to find and fix a problem the longer it has been festering in
>>> the repo. It's best to keep such problems out in the first place.
>>>
>>> With some more work on our CI infrastructure, I think this can be done.
>>> Maybe even later this year.
>>>
>>> Nick
>>>
>>> On Thu, Apr 2, 2015 at 6:02 AM Steve Loughran <ste...@hortonworks.com> wrote:
>>>
>>>> On 2 Apr 2015, at 06:31, Patrick Wendell <pwend...@gmail.com> wrote:
>>>>>
>>>>> Hey Marcelo,
>>>>>
>>>>> Great question. Right now, some of the more active developers have an
>>>>> account that allows them to log into this cluster to inspect logs (we
>>>>> copy the logs from each run to a node on that cluster). The
>>>>> infrastructure is maintained by the AMPLab.
>>>>>
>>>>> I will put you in touch with someone there who can get you an account.
>>>>>
>>>>> This is a short-term solution. The longer-term solution is to have
>>>>> these scp'd regularly to an S3 bucket or somewhere people can get
>>>>> access to them, but that's not ready yet.
>>>>>
>>>>> - Patrick
>>>>
>>>> ASF Jenkins is always there to play with; committers/PMC members should
>>>> just need to file a BUILD JIRA to get access.
>>>> Its main limitation is more cultural than technical: you need to get
>>>> people to care about intermittent test runs, otherwise you can end up
>>>> with failures that nobody keeps on top of:
>>>> https://builds.apache.org/view/H-L/view/Hadoop/
>>>>
>>>> Someone really needs to own the "keep the builds working" problem, and
>>>> have the ability to somehow kick others into fixing things. The latter
>>>> is pretty hard cross-organisation.
>>>>
>>>>> That would be really helpful to debug build failures. The scalatest
>>>>> output isn't all that helpful.
>>>>
>>>> Potentially an issue with the test runner, rather than the tests
>>>> themselves.
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
>>>> For additional commands, e-mail: dev-h...@spark.apache.org
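The per-PR gating Nick proposes above ("all builds we care about run for every PR; no PR goes in if any build fails") can be sketched as a small driver script. This is only an illustration of the idea, not the actual Spark or Jenkins setup: the configuration names and the `run_build` function are hypothetical placeholders for whatever the real matrix and per-configuration test command would be (e.g. invoking `./dev/run-tests` with different profiles).

```shell
#!/usr/bin/env bash
# Hypothetical sketch: run every build configuration for a PR and
# pass only if all of them succeed. Config names are illustrative.
set -u

CONFIGS=("hadoop-2.4" "hadoop-2.6" "scala-2.10" "scala-2.11")

run_build() {
  # Placeholder for the real per-configuration test command.
  local config="$1"
  echo "running tests for ${config}"
  return 0
}

failed=()
for config in "${CONFIGS[@]}"; do
  if ! run_build "$config"; then
    failed+=("$config")
  fi
done

# Gate the merge: any failing configuration fails the whole run,
# so the failure is attached to the PR instead of a periodic build.
if [ "${#failed[@]}" -gt 0 ]; then
  echo "FAILED configurations: ${failed[*]}"
  exit 1
fi
echo "all configurations passed"
```

The point of the sketch is the aggregation step: a periodic build that fails "on its own little island" becomes a blocking signal once every configuration's result feeds into the single pass/fail status the PR builder reports.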