Notice: I'm messing with test-patch.sh reporting to try to improve the zombie section. I'll likely break things for a while (I already have -- the hadoopqa report section is curtailed at the moment). Will flag when done.

St.Ack

On Wed, Dec 2, 2015 at 1:22 PM, Stack <st...@duboce.net> wrote:

> As part of my continuing advocacy of builds.apache.org and that their
> results are now worthy of our trust and nurture, here are some highlights
> from the last few days of builds:
>
> + hadoopqa is now finding zombies before the patch is committed.
> HBASE-14888 showed "-1 core tests. The patch failed these unit tests:" but
> didn't have any failed tests listed (I'm trying to see if I can do anything
> about this...). Running our little ./dev-tools/findHangingTests.py against
> the consoleText, it showed a hanging test. Running locally, I see the same
> hang. This is before the patch landed.
> + Our branch runs are now near totally zombie and flakey free -- still
> some work to do -- but a recent patch that seemed harmless was causing a
> reliable flake fail in the backport to branch-1*, confirmed by local runs.
> The flakeyness was plain to see up in builds.apache.org.
> + In the last few days I've committed a patch that included javadoc
> warnings even though hadoopqa said the patch introduced javadoc issues (I
> missed it). This messed up life for folks subsequently as their patches now
> reported javadoc issues....
>
> In short, I suggest that builds.apache.org is worth keeping an eye on,
> make sure you get a clean build out of hadoopqa before committing anything,
> and let's all work together to try and keep our builds blue: it'll save us
> all work in the long run.
>
> St.Ack
>
>
> On Tue, Nov 4, 2014 at 9:38 AM, Stack <st...@duboce.net> wrote:
>
>> Branch-1 and master have stabilized and now run mostly blue (give or take
>> the odd failure) [1][2]. Having a mostly blue branch-1 has helped us
>> identify at least one destabilizing commit in the last few days, maybe two;
>> this is as it should be (smile).
>>
>> Let's keep our builds blue. If you commit a patch, make sure subsequent
>> builds stay blue. You can subscribe to bui...@hbase.apache.org to get
>> notice of failures if not already subscribed.
>>
>> Thanks,
>> St.Ack
>>
>> 1. https://builds.apache.org/view/H-L/view/HBase/job/HBase-1.0/
>> 2. https://builds.apache.org/view/H-L/view/HBase/job/HBase-TRUNK/
>>
>>
>> On Mon, Oct 13, 2014 at 4:41 PM, Stack <st...@duboce.net> wrote:
>>
>>> A few notes on testing.
>>>
>>> Too long to read: infra is more capable now and after some work, we are
>>> seeing branch-1 and trunk mostly running blue. Let's try and keep it this
>>> way going forward.
>>>
>>> Apache Infra has new, more capable hardware.
>>>
>>> A recent spurt of test fixing combined with more capable hardware seems
>>> to have gotten us to a new place; tests are mostly passing now on branch-1
>>> and master. Let's try and keep it this way and start to trust our test runs
>>> again. Just a few flakies remain. Let's try and nail them.
>>>
>>> Our tests now run in parallel with other test suites where previously we
>>> ran alone. You can see this sometimes when our zombie detector reports
>>> tests from another project altogether as lingerers (to be fixed). Some of
>>> our tests are failing because a concurrent hbase run is undoing classes and
>>> data from under it. Also, let's fix.
>>>
>>> Our tests are brittle. It takes 75 minutes for them to complete. Many
>>> are heavy-duty integration tests starting up multiple clusters and
>>> mapreduce all in the one JVM. It is a miracle they pass at all. Usually
>>> integration tests have been cast as unit tests because there was nowhere
>>> else for them to get an airing. We have the hbase-it suite now which would
>>> be a more apt place but until these are run on a regular basis in public
>>> for all to see, the fat integration tests disguised as unit tests will
>>> remain. A review of our current unit tests weeding the old cruft and the
>>> no longer relevant or duplicates would be a nice undertaking if someone is
>>> looking to contribute.
>>>
>>> Alex Newman has been working on making our tests work up on travis and
>>> circle-ci. That'll be sweet when it goes end-to-end. He also added in
>>> some "type" categorizations -- client, filter, mapreduce -- alongside our
>>> old "sizing" categorizations of small/medium/large. His thinking is that
>>> we can run these categorizations in parallel so we could run the total
>>> suite in about the time of the longest test, say 20-30 minutes? We could
>>> even change Apache to run them this way.
>>>
>>> FYI,
>>> St.Ack
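
For anyone wondering how a hanging test can be spotted from a build's consoleText, here is a rough sketch of the idea. It is not the actual findHangingTests.py; it assumes surefire-style console lines of the form "Running <class>" when a test class starts and "Tests run: ... - in <class>" when it reports, and the function name find_hanging_tests is made up for illustration. Any class that started but never reported is treated as a candidate zombie.

```python
import re

# Assumed surefire-style console lines (an assumption, not taken from the real script):
#   Running org.example.TestFoo
#   Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.2 sec - in org.example.TestFoo
RUNNING = re.compile(r"^Running (\S+)$")
FINISHED = re.compile(r"^Tests run:.* - in (\S+)$")


def find_hanging_tests(console_text):
    """Return test classes that started but never reported a result, in start order."""
    started = []      # keeps the order in which test classes were first seen
    finished = set()  # classes that printed a "Tests run: ..." summary line
    for raw in console_text.splitlines():
        line = raw.strip()
        match = RUNNING.match(line)
        if match:
            if match.group(1) not in started:
                started.append(match.group(1))
            continue
        match = FINISHED.match(line)
        if match:
            finished.add(match.group(1))
    return [name for name in started if name not in finished]
```

To try it, save a build's consoleText to a file and pass the file's contents to find_hanging_tests. If the console lines carry a prefix (for example a log-level tag), the two patterns would need adjusting.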