This is great.

To completely retrace a rare failure, we may need the following persisted
after the run:
- The console log of the run
- All daemon logs
- All WALs
- All HFiles
WALs and HFiles should be organized by time, from oldest to newest.

All of these could reside in an S3 bucket.
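
As a rough sketch of the time ordering (not something the clusterdock job does today), the Python below assumes the WALs and HFiles have already been copied out of the containers into two local directories, and that file modification times are a usable proxy for age. The function name, the `itbll-runs` prefix, and the key layout are all hypothetical, and the actual upload to S3 is left out.

```python
from pathlib import Path


def ordered_upload_plan(run_id, wal_dir, hfile_dir, prefix="itbll-runs"):
    """Return (local_path, object_key) pairs, oldest file first within each kind.

    Assumption: mtime reflects file age; prefix and key layout are illustrative.
    """
    plan = []
    for kind, root in (("wals", Path(wal_dir)), ("hfiles", Path(hfile_dir))):
        files = [p for p in root.rglob("*") if p.is_file()]
        # Sort by modification time, then by path so ties order deterministically.
        files.sort(key=lambda p: (p.stat().st_mtime_ns, str(p)))
        for seq, path in enumerate(files):
            # Zero-padded sequence number: lexicographic key order == time order.
            key = f"{prefix}/{run_id}/{kind}/{seq:06d}-{path.name}"
            plan.append((path, key))
    return plan
```

Each returned key can then be handed to whatever S3 client the job uses. Because the sequence number is zero-padded, a plain lexicographic listing of a prefix reads oldest to newest.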



On Tue, Aug 23, 2016 at 12:26 AM, Dima Spivak <dspi...@cloudera.com> wrote:

> Yep, that's the next improvement I plan on making. Docker has API endpoints
> for copying files from a container to the host, so I can definitely use
> that to move logs from the cluster to the Jenkins workspace if a test
> fails.
>
> On Monday, August 22, 2016, Nick Dimiduk <ndimi...@gmail.com> wrote:
>
> > This sounds great! Is there a way to gather logs and/or data files from the
> > containers before termination? Can they be stored on Jenkins as part of the
> > job artifacts?
> >
> > On Monday, August 22, 2016, Ted Yu <yuzhih...@gmail.com> wrote:
> >
> > > Nice job, Dima.
> > >
> > > Is there a Jenkins job for running ITBLL for the 1.2 / 1.3 branches?
> > >
> > > Cheers
> > >
> > > On Mon, Aug 22, 2016 at 5:33 PM, Dima Spivak <dspi...@cloudera.com> wrote:
> > >
> > > > Dear devs,
> > > >
> > > > > tl;dr: We now have Jenkins jobs
> > > > > <https://builds.apache.org/job/HBase-master-IntegrationTestBigLinkedList/>
> > > > > that can run IntegrationTestBigLinkedList with fault injection on 5-node
> > > > > Apache HBase clusters built from source.
> > > >
> > > > Long version:
> > > >
> > > > > I just wanted to provide an update on some recent work we've gotten done
> > > > > since committing an Apache HBase topology for clusterdock
> > > > > <https://github.com/apache/hbase/commit/ccf5d27d7aa238c8398d2818928a71f39bd749a0>
> > > > > (a Python-based framework for building and starting Docker container-based
> > > > > clusters).
> > > >
> > > > > Despite the existence of an awesome system test framework with
> > > > > fault-injection capabilities in the form of the hbase-it module, we've
> > > > > never had an easy way to run these tests on distributed clusters upstream.
> > > > > This has long been a big hole in our Jenkins test coverage, but since the
> > > > > clusterdock topology got committed, we've been making progress on doing
> > > > > something about it. I'm happy to report that, starting today, we are now
> > > > > running IntegrationTestBigLinkedList with fault-injection on Apache
> > > > > Infrastructure
> > > > > <https://builds.apache.org/job/HBase-master-IntegrationTestBigLinkedList/>.
> > > >
> > > > > Even longer version (stop reading here if you don't care how we do it):
> > > >
> > > > > So how do we do it? Well, clusterdock is designed to start up multiple
> > > > > Docker containers on one host where each container acts like a lightweight
> > > > > VM (so 4 containers = 4-node cluster). What's in these containers (and what
> > > > > to do when starting them) is controlled by clusterdock's "topology"
> > > > > abstraction. Our apache_hbase topology builds a Docker image from a Java
> > > > > tarball, Hadoop tarball, and an HBase version. This last part can be either
> > > > > a binary tarball (for RC testing or playing around with a release) or a Git
> > > > > commit, in which case our clusterdock topology builds HBase from source.
> > > > > Once we build a cluster, we can then push the cluster images (actually,
> > > > > just one Docker image) to a shared Docker registry for repeated use. We now
> > > > > have a matrix job that can build any branches we care about (I set it up
> > > > > against branch-1.2
> > > > > <https://builds.apache.org/view/H-L/view/HBase/job/HBase-Build-clusterdock-Clusters/HBASE_VERSION=branch-1.2,label=docker/>,
> > > > > branch-1.3
> > > > > <https://builds.apache.org/view/H-L/view/HBase/job/HBase-Build-clusterdock-Clusters/HBASE_VERSION=branch-1.3,label=docker/>,
> > > > > and master
> > > > > <https://builds.apache.org/view/H-L/view/HBase/job/HBase-Build-clusterdock-Clusters/HBASE_VERSION=master,label=docker/>
> > > > > to start) and do this.
> > > >
> > > > > Once these images are built (and pushed), we can use them to start up an
> > > > > n-node sized cluster on one host and run tests against it. To begin, I've
> > > > > set up a super simple Jenkins job that starts up a 5-node cluster, runs
> > > > > ITBLL (with an optional Chaos Monkey), and then exits.
> > > >
> > > > > This work is being tracked in HBASE-15964 and there's much more that I want
> > > > > to do (more tests, more Chaos Monkeys, more branches, more diagnostic
> > > > > information collection when a test fails), but I figured I'd let you guys
> > > > > know about what we have going so far. :)
> > > >
> > > > PS: Special thanks to Jon Hsieh for helping me get the Jenkins jobs
> > > > running.
> > > >
> > > > --
> > > > -Dima
> > > >
> > >
> >
>
>
> --
> -Dima
>



-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)
