I chatted with Patrick briefly offline. It would be interesting to know whether the scripts have some way of saying "run a smaller version of certain tests" (e.g. by setting a system property that the tests look at to decide what to run). That way, if there are no changes under sql/, we could still run a small part of HiveCompatibilitySuite, just not all of it. The reasoning is that if a core change breaks something in Hive, it will probably break many tests, not just one specific test.
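
To make the idea concrete, here is a minimal sketch of how a test script might choose between the full and the reduced run. Everything named here is hypothetical: the property name `spark.test.hive.subset`, the helper functions, and the rule of checking only for paths under `sql/` are assumptions made for illustration, not the existing Spark build code.

```python
# Hypothetical sketch: the property name and helpers are illustrative only
# and do not come from Spark's actual test scripts.
HIVE_SUBSET_PROPERTY = "spark.test.hive.subset"  # assumed name


def hive_test_mode(changed_files):
    """Return "full" if any changed path is under sql/, else "subset"."""
    if any(path.startswith("sql/") for path in changed_files):
        return "full"
    return "subset"


def jvm_test_flags(changed_files):
    """Build the extra JVM flags the test run would be launched with."""
    if hive_test_mode(changed_files) == "subset":
        # A suite could read this property and run only a sample of its cases.
        return ["-D%s=true" % HIVE_SUBSET_PROPERTY]
    return []
```

Under these assumptions, a change touching only `core/` would yield `["-Dspark.test.hive.subset=true"]`, and a change touching anything under `sql/` would yield no extra flag, so the suite would run in full. The suite itself would still need to check the property and decide which cases make up the smaller run.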

On Tue, Aug 25, 2015 at 1:48 PM, Michael Armbrust <mich...@databricks.com> wrote:
> I'd be okay skipping the HiveCompatibilitySuite for core-only changes. They
> do often catch bugs in changes to catalyst or sql though. Same for
> HashJoinCompatibilitySuite/VersionsSuite.
>
> HiveSparkSubmitSuite/CliSuite should probably stay, as they do test things
> like addJar that have been broken by core in the past.
>
> On Tue, Aug 25, 2015 at 1:40 PM, Patrick Wendell <pwend...@gmail.com> wrote:
>>
>> There is already code in place that restricts which tests run
>> depending on which code is modified. However, changes inside of
>> Spark's core currently require running all dependent tests. If you
>> have some ideas about how to improve that heuristic, it would be
>> great.
>>
>> - Patrick
>>
>> On Tue, Aug 25, 2015 at 1:33 PM, Marcelo Vanzin <van...@cloudera.com>
>> wrote:
>> > Hello y'all,
>> >
>> > So I've been getting kinda annoyed with how many PR tests have been
>> > timing out. I took one of the logs from one of my PRs and started to
>> > do some crunching on the data from the output, and here's a list of
>> > the 5 slowest suites:
>> >
>> > 307.14s HiveSparkSubmitSuite
>> > 382.641s VersionsSuite
>> > 398s CliSuite
>> > 410.52s HashJoinCompatibilitySuite
>> > 2508.61s HiveCompatibilitySuite
>> >
>> > Looking at those, I'm not surprised at all that we see so many
>> > timeouts. Is there any ongoing effort to trim down those tests
>> > (especially HiveCompatibilitySuite) or somehow restrict when they're
>> > run?
>> >
>> > Almost 1 hour to run a single test suite that affects a rather
>> > isolated part of the code base looks a little excessive to me.
>> >
>> > --
>> > Marcelo
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
>> > For additional commands, e-mail: dev-h...@spark.apache.org

--
Marcelo