I dug into this a bit a while ago. I think that if someone sat down and profiled the tests, we could likely find some things to optimize. In particular, there may be overhead in starting up a local Spark context that could be minimized, which would speed up all the tests. Also, some tests (especially in Streaming) take a really long time, around 60 seconds for a single test (see some of the new Flume tests). These could almost certainly be optimized.
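To make the profiling idea a bit more concrete, here is a rough sketch (plain Python, not Spark code) of how one might rank tests by wall-clock time once per-test durations have been collected. The test names and durations are made up for illustration, and `slowest_tests` is a hypothetical helper, not something that exists in the Spark build.

```python
from typing import Dict, List, Tuple


def slowest_tests(durations: Dict[str, float], top_n: int = 3) -> List[Tuple[str, float, float]]:
    """Return the top_n slowest tests as (name, seconds, share_of_total)."""
    total = sum(durations.values())
    # Sort by duration, longest first.
    ranked = sorted(durations.items(), key=lambda kv: kv[1], reverse=True)
    return [(name, secs, secs / total if total else 0.0) for name, secs in ranked[:top_n]]


# Illustrative numbers only: these are NOT measurements from the Spark test suite,
# and the test names are hypothetical.
sample = {
    "FlumeStreamSuite.example": 60.0,
    "CoreSuite.fast_case": 0.5,
    "SqlSuite.medium_case": 9.5,
    "StreamingSuite.slow_case": 30.0,
}

for name, secs, share in slowest_tests(sample, top_n=2):
    print(f"{name}: {secs:.1f}s ({share:.0%} of total)")
```

If a handful of tests turn out to account for most of the total, those are the obvious first candidates to look at.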
I think 5 minutes might be out of reach, but something like a 2X improvement might be possible and would be very valuable if accomplished.

- Patrick

On Fri, Aug 8, 2014 at 11:24 AM, Matei Zaharia <matei.zaha...@gmail.com> wrote:
> Just as a note, when you're developing stuff, you can use "test-only" in sbt,
> or the equivalent feature in Maven, to run just some of the tests. This is
> what I do; I don't wait for Jenkins to run things. 90% of the time if it
> passes the tests that I know could break stuff, it will pass all of Jenkins.
>
> Jenkins should always be doing all the integration tests, so I don't think it
> will become *that* much shorter in the long run, though it can certainly be
> improved.
>
> Matei
>
> On August 8, 2014 at 10:20:35 AM, Nicolas Liochon (nkey...@gmail.com) wrote:
>
> fwiw, when we did this work in HBase, we categorized the tests. Then some
> tests can share a single JVM, while some others need to be isolated in
> their own JVM. Nevertheless surefire can still run them in parallel by
> starting/stopping several JVMs.
>
> Nicolas
>
>
> On Fri, Aug 8, 2014 at 7:10 PM, Reynold Xin <r...@databricks.com> wrote:
>
>> ScalaTest actually has support for parallelization built-in. We can use
>> that.
>>
>> The main challenge is to make sure all the test suites can work in parallel
>> when running alongside each other.
>>
>>
>> On Fri, Aug 8, 2014 at 9:47 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>>
>> > How about using the parallel execution feature of maven-surefire-plugin
>> > (assuming all the tests were made parallel-friendly)?
>> >
>> > http://maven.apache.org/surefire/maven-surefire-plugin/examples/fork-options-and-parallel-execution.html
>> >
>> > Cheers
>> >
>> >
>> > On Fri, Aug 8, 2014 at 9:14 AM, Sean Owen <so...@cloudera.com> wrote:
>> >
>> > > A common approach is to separate unit tests from integration tests.
>> > > Maven has support for this distinction. I'm not sure it helps a lot
>> > > though, since it only helps you to not run integration tests all the
>> > > time. But lots of Spark tests are integration-test-like and are
>> > > important to run to know a change works.
>> > >
>> > > I haven't heard of a plugin to run different test suites remotely on
>> > > many machines, but I would not be surprised if it exists.
>> > >
>> > > The Jenkins servers aren't CPU-bound as far as I can tell. It's that
>> > > the tests spend a lot of time waiting for bits to start up or
>> > > complete. That implies the existing tests could be sped up by just
>> > > running in parallel locally. I recall someone recently proposed this?
>> > >
>> > > And I think the problem with that is simply that some of the tests
>> > > collide with each other, by opening up the same port at the same time
>> > > for example. I know that kind of problem is being attacked even right
>> > > now. But if all the tests were made parallel-friendly, I imagine
>> > > parallelism could be enabled and speed up builds greatly without any
>> > > remote machines.
>> > >
>> > >
>> > > On Fri, Aug 8, 2014 at 5:01 PM, Nicholas Chammas
>> > > <nicholas.cham...@gmail.com> wrote:
>> > > > Howdy,
>> > > >
>> > > > Do we think it's both feasible and worthwhile to invest in getting our
>> > > > unit tests to finish in under 5 minutes (or something similarly brief)
>> > > > when run by Jenkins?
>> > > >
>> > > > Unit tests currently seem to take anywhere from 30 min to 2 hours. As
>> > > > people add more tests, I imagine this time will only grow. I think it
>> > > > would be better for both contributors and reviewers if they didn't have
>> > > > to wait so long for test results; PR reviews would be shorter, if
>> > > > nothing else.
>> > > >
>> > > > I don't know how this is normally done, but maybe it wouldn't be too
>> > > > much work to get a test cycle to feel lighter.
>> > > >
>> > > > Most unit tests are independent and can be run concurrently, right?
>> > > > Would it make sense to build a given patch on many servers at once and
>> > > > send disjoint sets of unit tests to each?
>> > > >
>> > > > I'd be interested in working on something like that if possible (and
>> > > > sensible).
>> > > >
>> > > > Nick
>> > >
>> > > ---------------------------------------------------------------------
>> > > To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
>> > > For additional commands, e-mail: dev-h...@spark.apache.org