That's great, Christopher. Thanks for working on those tests yesterday. I have now seen all ITs pass at least once, as those tests passed on run https://jenkins.revelc.net/job/Accumulo-1.8-ITs/33/testReport/
So I think that means it is time for an RC. Anyone feel otherwise? Assuming we are ready, I will be making the first RC tomorrow.

On Tue, Aug 9, 2016 at 10:50 PM, Christopher <ctubb...@apache.org> wrote:

> So close to full passing in 1.8. It seems there's still a straggler with
> time-sensitivity issues (I hope that's all it is, anyway). I'm not actually
> sure why Jenkins reports the single test failure as two, though:
> https://jenkins.revelc.net/job/Accumulo-1.8-ITs/36/
>
> On Fri, Aug 5, 2016 at 3:51 PM Michael Wall <mjw...@gmail.com> wrote:
>
> > I have been able to get every IT to pass at least once except the following
> >
> > ACCUMULO-4362 <https://issues.apache.org/jira/browse/ACCUMULO-4362>
> > ACCUMULO-4397 <https://issues.apache.org/jira/browse/ACCUMULO-4397>
> >
> > These are moved back to release 1.8.0 and are blockers.
> >
> > On Fri, Aug 5, 2016 at 10:29 AM, Josh Elser <josh.el...@gmail.com> wrote:
> >
> > > Sean Busbey wrote:
> > >
> > >> On Wed, Aug 3, 2016 at 5:17 PM, Christopher<ctubb...@apache.org> wrote:
> > >>
> > >>> On Wed, Aug 3, 2016 at 5:47 PM Sean Busbey<bus...@cloudera.com> wrote:
> > >>>
> > >>> My understanding was that maintenance releases (aka double dot, e.g.
> > >>>> 1.7.2) had relaxed criteria because we expected the scope of changes
> > >>>> in them to be more limited. Even so, the release notes for 1.7.2,
> > >>>> 1.7.1, and 1.7.0 all claim the ITs passed.
> > >>>>
> > >>>> Even those releases have periodic IT failure.
> > >>>
> > >>> Is there a reason we can't parallelize the ITs?
> > >>>>
> > >>> We can. Eric's mrit effort was all intended towards that. But, that's not
> > >>> the same as CI passing. I don't know what it would take to parallelize them
> > >>> in a CI server.
> > >>>
> > >>> What's stopping
> > >>>> builds.a.o from running them? Specific requests from projects to asf
> > >>>> infra can get us resources if that's the problem.
> > >>>>
> > >>>> I spoke to infra in HipChat about this a few weeks ago, and mentioned a
> > >>> few things which impact builds on ASF jenkins (builds.apache.org):
> > >>>
> > >>> 1. Accumulo has an excessive number of tests to run.
> > >>> 2. Build timeouts with Jenkins can abort builds.
> > >>> 3. Tests are timing sensitive, and are affected by VM/host configuration
> > >>> and contention with other concurrent builds from other projects.
> > >>> 4. Tests need lots of RAM and storage (at least 4GB RAM, but ideally no
> > >>> less than 16GB, and at least 6 GB for a workspace)
> > >>> 5. Tests need specialized system configuration, (increasing ulimits,
> > >>> optimizing kernel settings for swappiness, etc.)
> > >>>
> > >>> What we really need for reliable IT passing in CI, is exclusive use of
> > >>> dedicated, bare-metal beefy build machines, for 6+ hours per build x 4
> > >>> branches minimum, plus another 6+ hours for each pull request and other
> > >>> builds which skipITs, so we can get immediate feedback on unit tests and
> > >>> compilation errors.
> > >>>
> > >> I took a first pass at a nightly (~once per 12 hours) job on asf build for
> > >> master and it did okay, considering that I haven't spent any time trying to
> > >> tune anything:
> > >>
> > >> https://builds.apache.org/job/Accumulo-master-IT/1/
> > >>
> > >> 2 hr 9 min, 7 failures out of 202 tests.
> > >>
> > >> I think we can do this; if anyone else is interested I'll start a new thread
> > >> where we can discuss.
> > >>
> > > +1 it would be great to do this on ASF infra.