Sean Busbey wrote:
On Wed, Aug 3, 2016 at 5:17 PM, Christopher <ctubb...@apache.org> wrote:
On Wed, Aug 3, 2016 at 5:47 PM Sean Busbey <bus...@cloudera.com> wrote:

My understanding was that maintenance releases (aka double dot, e.g.
1.7.2) had relaxed criteria because we expected the scope of changes
in them to be more limited. Even so, the release notes for 1.7.2,
1.7.1, and 1.7.0 all claim the ITs passed.


Even those releases have periodic IT failures.


Is there a reason we can't parallelize the ITs?

We can. Eric's mrit effort was aimed entirely at that. But that's not
the same as CI passing. I don't know what it would take to parallelize them
on a CI server.


What's stopping builds.a.o from running them? Specific requests from
projects to ASF infra can get us resources if that's the problem.


I spoke to infra in HipChat about this a few weeks ago, and mentioned a
few things which impact builds on ASF Jenkins (builds.apache.org):

1. Accumulo has an excessive number of tests to run.
2. Build timeouts with Jenkins can abort builds.
3. Tests are timing sensitive, and are affected by VM/host configuration
and contention with other concurrent builds from other projects.
4. Tests need lots of RAM and storage (at least 4 GB RAM, but ideally no
less than 16 GB, and at least 6 GB for a workspace).
5. Tests need specialized system configuration (increasing ulimits,
optimizing kernel settings for swappiness, etc.).
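
To make items 4 and 5 concrete, here is a minimal sketch of a pre-flight check a build job could run before starting the ITs. It is only an illustration, not anything the project ships: the function names are hypothetical, the RAM and workspace minimums come from the list above (treated as GiB), and the open-file threshold is purely an assumption, since no specific ulimit value is given in this thread. It relies on Unix-only calls, so it would not run on Windows.

```python
import os
import resource
import shutil

GIB = 1024 ** 3

MIN_RAM_BYTES = 4 * GIB        # "at least 4 GB RAM" (item 4), treated as GiB
MIN_WORKSPACE_BYTES = 6 * GIB  # "at least 6 GB for a workspace" (item 4)
MIN_OPEN_FILES = 32768         # ASSUMPTION: the thread names no ulimit value


def find_problems(ram_bytes, free_disk_bytes, open_files_limit):
    """Return a list of problems; an empty list means the host looks usable."""
    problems = []
    if ram_bytes < MIN_RAM_BYTES:
        problems.append("RAM below 4 GiB")
    if free_disk_bytes < MIN_WORKSPACE_BYTES:
        problems.append("workspace has less than 6 GiB free")
    if open_files_limit < MIN_OPEN_FILES:
        problems.append("open-file ulimit too low")
    return problems


def probe_host(workspace="."):
    """Read total RAM, free disk space, and the soft open-file limit (Unix only)."""
    ram = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    free_disk = shutil.disk_usage(workspace).free
    soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft_limit == resource.RLIM_INFINITY:
        soft_limit = float("inf")  # an unlimited soft limit always passes
    return ram, free_disk, soft_limit


if __name__ == "__main__":
    for problem in find_problems(*probe_host()):
        print("WARNING:", problem)
```

Keeping the thresholds in `find_problems` separate from the host probing means the limits can be adjusted or tested without touching a real build machine. Swappiness and other kernel settings are not checked here.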

What we really need for the ITs to pass reliably in CI is exclusive use of
dedicated, beefy bare-metal build machines for 6+ hours per build x 4
branches minimum, plus another 6+ hours for each pull request, plus other
builds which skip the ITs (skipITs) so we can get immediate feedback on
unit tests and compilation errors.
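
As a rough back-of-the-envelope sketch of what that adds up to, the snippet below multiplies out the figures from the paragraph above. It uses 6 hours as a lower bound for "6+", ignores pull-request builds entirely, and the scheduling window is an assumption introduced only for illustration; the function names are hypothetical.

```python
import math


def machine_hours_per_round(branches, hours_per_build):
    """Machine-hours needed to run the full ITs once on every branch."""
    return branches * hours_per_build


def machines_needed(branches, hours_per_build, window_hours):
    """Dedicated machines needed to finish one round within window_hours.

    Assumes each machine runs one build at a time (exclusive use).
    """
    total = machine_hours_per_round(branches, hours_per_build)
    return math.ceil(total / window_hours)


print(machine_hours_per_round(4, 6))  # 24
print(machines_needed(4, 6, 24))      # 1
print(machines_needed(4, 6, 12))      # 2
```

In other words, under these assumptions a single dedicated machine is fully occupied just running each of the 4 branches once a day, before any pull-request builds are counted.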


I took a first pass at a nightly (~once per 12 hours) job on the ASF build
server for master, and it did okay considering that I haven't spent any time
trying to tune anything:

https://builds.apache.org/job/Accumulo-master-IT/1/

2 hr 9 min, 7 failures out of 202 tests.

I think we can do this; if anyone else is interested I'll start a new thread
where we can discuss.

+1, it would be great to do this on ASF infra.
