Dear devs,

tl;dr: We now have Jenkins jobs
<https://builds.apache.org/job/HBase-master-IntegrationTestBigLinkedList/>
that can run IntegrationTestBigLinkedList with fault injection on 5-node
Apache HBase clusters built from source.

Long version:

I just wanted to provide an update on some recent work we've gotten done
since committing an Apache HBase topology for clusterdock
<https://github.com/apache/hbase/commit/ccf5d27d7aa238c8398d2818928a71f39bd749a0>
(a Python-based framework for building and starting Docker container-based
clusters).

Despite the existence of an awesome system test framework with
fault-injection capabilities in the form of the hbase-it module, we've
never had an easy way to run these tests on distributed clusters upstream.
This has long been a big hole in our Jenkins test coverage, but since the
clusterdock topology got committed, we've been making progress on doing
something about it. I'm happy to report that, starting today, we are
running IntegrationTestBigLinkedList with fault injection on Apache
Infrastructure
<https://builds.apache.org/job/HBase-master-IntegrationTestBigLinkedList/>.

Even longer version (stop reading here if you don't care how we do it):

So how do we do it? Well, clusterdock is designed to start up multiple
Docker containers on one host, where each container acts like a lightweight
VM (so 4 containers = a 4-node cluster). What's in these containers (and what
to do when starting them) is controlled by clusterdock's "topology"
abstraction. Our apache_hbase topology builds a Docker image from a Java
tarball, Hadoop tarball, and an HBase version. This last part can be either
a binary tarball (for RC testing or playing around with a release) or a Git
commit, in which case our clusterdock topology builds HBase from source.
Once we build a cluster, we can then push the cluster images (actually,
just one Docker image) to a shared Docker registry for repeated use. We now
have a matrix job that can build any branches we care about (I set it up
against branch-1.2
<https://builds.apache.org/view/H-L/view/HBase/job/HBase-Build-clusterdock-Clusters/HBASE_VERSION=branch-1.2,label=docker/>,
branch-1.3
<https://builds.apache.org/view/H-L/view/HBase/job/HBase-Build-clusterdock-Clusters/HBASE_VERSION=branch-1.3,label=docker/>,
and master
<https://builds.apache.org/view/H-L/view/HBase/job/HBase-Build-clusterdock-Clusters/HBASE_VERSION=master,label=docker/>
to start) and push the resulting images.
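
To make the "binary tarball or Git commit" distinction concrete, here is a
minimal Python sketch of how a version string could be sorted into one case
or the other. The helper name and the suffix check are my own assumptions
for illustration; this is not the code in the apache_hbase topology.

```python
# Hypothetical helper for illustration only; not taken from clusterdock.
TARBALL_SUFFIXES = (".tar.gz", ".tgz")


def classify_hbase_version(version):
    """Return 'tarball' if version looks like a binary tarball, else 'git'.

    Assumption: anything that does not end in a tarball suffix is treated
    as a Git reference (branch, tag, or commit) to be built from source.
    """
    if version.lower().endswith(TARBALL_SUFFIXES):
        return "tarball"
    return "git"


if __name__ == "__main__":
    for candidate in ("branch-1.2", "branch-1.3", "master"):
        print(candidate, "->", classify_hbase_version(candidate))
```

With this sketch, branch-1.2, branch-1.3, and master are all classified as
Git references, which corresponds to the build-from-source path.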

Once these images are built (and pushed), we can use them to start up an
n-node cluster on one host and run tests against it. To begin, I've
set up a super simple Jenkins job that starts up a 5-node cluster, runs
ITBLL (with an optional Chaos Monkey), and then exits.
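
For the curious, the "runs ITBLL (with an optional Chaos Monkey)" step boils
down to assembling an hbase command line. The Python sketch below shows one
way to do that. The function name, the default values, and the output
directory are assumptions for illustration; the -m option and the Loop
argument order are as I understand them from hbase-it, so check them against
your branch. It is not the script the Jenkins job actually runs.

```python
ITBLL_CLASS = "org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList"


def build_itbll_command(output_dir, monkey=None, iterations=1, mappers=1,
                        nodes_per_mapper=1000000, reducers=1):
    """Build an ITBLL 'Loop' command line as a list of arguments.

    Hypothetical helper: all defaults are placeholders, not the values
    used by the Jenkins job.
    """
    cmd = ["hbase", ITBLL_CLASS]
    if monkey:
        # Assumed: -m selects the Chaos Monkey (e.g. slowDeterministic).
        cmd += ["-m", monkey]
    # Assumed argument order for Loop:
    # <iterations> <mappers> <nodes per mapper> <output dir> <reducers>
    cmd += ["Loop", str(iterations), str(mappers), str(nodes_per_mapper),
            output_dir, str(reducers)]
    return cmd


if __name__ == "__main__":
    # /tmp/itbll is a placeholder output directory.
    print(" ".join(build_itbll_command("/tmp/itbll",
                                       monkey="slowDeterministic")))
```

Leaving monkey unset gives a plain ITBLL run with no fault injection, which
is handy for checking that the cluster itself is healthy first.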

This work is being tracked in HBASE-15964 and there's much more that I want
to do (more tests, more Chaos Monkeys, more branches, more diagnostic
information collection when a test fails), but I figured I'd let you guys
know about what we have going so far. :)

PS: Special thanks to Jon Hsieh for helping me get the Jenkins jobs running.

-- 
-Dima
