Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

Stack Thu, 03 Dec 2015 09:27:55 -0800

Notice: I'm messing with test-patch.sh reporting, trying to improve the
zombie section. I'll likely break things for a while (I already have -- the
hadoopqa report section is curtailed at the moment). Will flag when done.
St.Ack


On Wed, Dec 2, 2015 at 1:22 PM, Stack <st...@duboce.net> wrote:

> As part of my continuing advocacy of builds.apache.org and that their
> results are now worthy of our trust and nurture, here are some highlights
> from the last few days of builds:
>
> + hadoopqa is now finding zombies before the patch is committed.
> HBASE-14888 showed "-1 core tests. The patch failed these unit tests:" but
> didn't have any failed tests listed (I'm trying to see if I can do anything
> about this...). Running our little ./dev-tools/findHangingTests.py against
> the consoleText, it showed a hanging test. Running locally, I see the same
> hang. This was before the patch landed.
> + Our branch runs are now nearly zombie- and flake-free -- still
> some work to do -- but a recent patch that seemed harmless was causing a
> reliable flaky failure in the backport to branch-1*, confirmed by local runs.
> The flakiness was plain to see up in builds.apache.org.
> + In the last few days I've committed a patch that included javadoc
> warnings even though hadoopqa said the patch introduced javadoc issues (I
> missed it). This messed up life for folks subsequently as their patches now
> reported javadoc issues....
>
> In short, I suggest that builds.apache.org is worth keeping an eye on:
> make sure you get a clean build out of hadoopqa before committing anything,
> and let's all work together to try and keep our builds blue. It'll save us
> all work in the long run.
>
> St.Ack
>
>
> On Tue, Nov 4, 2014 at 9:38 AM, Stack <st...@duboce.net> wrote:
>
>> Branch-1 and master have stabilized and now run mostly blue (give or take
>> the odd failure) [1][2]. Having a mostly blue branch-1 has helped us
>> identify at least one destabilizing commit in the last few days, maybe two;
>> this is as it should be (smile).
>>
>> Let's keep our builds blue. If you commit a patch, make sure subsequent
>> builds stay blue. You can subscribe to bui...@hbase.apache.org to get
>> notice of failures if not already subscribed.
>>
>> Thanks,
>> St.Ack
>>
>> 1. https://builds.apache.org/view/H-L/view/HBase/job/HBase-1.0/
>> 2. https://builds.apache.org/view/H-L/view/HBase/job/HBase-TRUNK/
>>
>>
>> On Mon, Oct 13, 2014 at 4:41 PM, Stack <st...@duboce.net> wrote:
>>
>>> A few notes on testing.
>>>
>>> TL;DR: infra is more capable now and, after some work, we are
>>> seeing branch-1 and trunk mostly running blue. Let's try and keep it this
>>> way going forward.
>>>
>>> Apache Infra has new, more capable hardware.
>>>
>>> A recent spurt of test fixing combined with more capable hardware seems
>>> to have gotten us to a new place; tests are mostly passing now on branch-1
>>> and master.  Let's try and keep it this way and start to trust our test runs
>>> again.  Just a few flakies remain.  Let's try and nail them.
>>>
>>> Our tests now run in parallel with other test suites where previously we
>>> ran alone. You can see this sometimes when our zombie detector reports
>>> tests from another project altogether as lingerers (to be fixed).  Some of
>>> our tests are failing because a concurrent hbase run is undoing classes and
>>> data from under them. Let's fix that too.
>>>
>>> Our tests are brittle. It takes 75 minutes for them to complete.  Many
>>> are heavy-duty integration tests starting up multiple clusters and
>>> mapreduce all in the one JVM. It is a miracle they pass at all.  Usually
>>> integration tests have been cast as unit tests because there was nowhere
>>> else for them to get an airing.  We have the hbase-it suite now which would
>>> be a more apt place but until these are run on a regular basis in public
>>> for all to see, the fat integration tests disguised as unit tests will
>>> remain.  A review of our current unit tests weeding the old cruft and the
>>> no longer relevant or duplicates would be a nice undertaking if someone is
>>> looking to contribute.
>>>
>>> Alex Newman has been working on making our tests work up on Travis and
>>> CircleCI.  That'll be sweet when it goes end-to-end.  He also added in
>>> some "type" categorizations -- client, filter, mapreduce -- alongside our
>>> old "sizing" categorizations of small/medium/large.  His thinking is that
>>> we can run these categorizations in parallel so we could run the total
>>> suite in about the time of the longest test, say 20-30 minutes?  We could
>>> even change Apache to run them this way.
>>>
>>> FYI,
>>> St.Ack
>>>
>>
>
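
For reference, the hanging-test check mentioned in the Dec 2 message (running
./dev-tools/findHangingTests.py against the consoleText) can be sketched
roughly as below. This is not the script itself, only a minimal illustration
of the idea: a test class that starts but never reports a result is a
candidate zombie. The two line patterns ("Running <class>" and
"Tests run: ... - in <class>") are an assumption about Surefire-style console
output, and find_hanging_tests is a hypothetical name.

```python
import re

# Assumed Surefire-style console lines; the real log format may differ.
STARTED = re.compile(r"^Running (\S+)$")
FINISHED = re.compile(r"^Tests run:.* - in (\S+)$")


def find_hanging_tests(console_text):
    """Return test classes that started but never reported a result.

    Classes are returned in the order they first appeared in the log.
    """
    started = []      # ordered list of classes seen starting
    finished = set()  # classes that printed a "Tests run: ..." summary
    for raw in console_text.splitlines():
        line = raw.strip()
        match = STARTED.match(line)
        if match:
            if match.group(1) not in started:
                started.append(match.group(1))
            continue
        match = FINISHED.match(line)
        if match:
            finished.add(match.group(1))
    return [name for name in started if name not in finished]
```

Feed it the saved console output of a build (for example, the text of a
consoleText download read from a local file); an empty list means every test
class that started also reported a result.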
