Re: [DISCUSS] Upgrading HBase and Kafka support

Michael Miklavcic Fri, 08 Mar 2019 08:38:43 -0800

I'm -1 on #1 unless there's some desperately compelling reason to go that
route. It would be a regression in our test coverage, and at that point
it's really just duplicating our unit tests as opposed to checking our
integration.


I'm good with 3. Gating factors for a successful implementation would be
that as a developer I can:

   1. Run it in my IDE without having to do anything extra (the beauty of
   the in-mem component is that @BeforeClass spins it up automatically - we
   should keep doing something along those lines)
   2. Run it via Maven cli
   3. Run it in Travis as part of our normal build
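
To make the first point concrete, here is a rough sketch of how a
Docker-backed component could keep the "spins up automatically" behavior.
Everything in it is an assumption for illustration: DockerComponent is a
hypothetical helper (not an existing Metron class), and the image and
container names are placeholders supplied by the caller. It only shells out
to the docker CLI and does not wait for the service to become ready.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not an existing Metron class: wraps one Docker
// container so a test can start and stop it from JUnit lifecycle hooks.
final class DockerComponent {
    private final String image;        // placeholder image name from the caller
    private final String name;         // container name, used again on stop
    private final int hostPort;
    private final int containerPort;

    DockerComponent(String image, String name, int hostPort, int containerPort) {
        this.image = image;
        this.name = name;
        this.hostPort = hostPort;
        this.containerPort = containerPort;
    }

    // Builds: docker run -d --rm --name <name> -p <host>:<container> <image>
    List<String> startCommand() {
        return Arrays.asList("docker", "run", "-d", "--rm", "--name", name,
                "-p", hostPort + ":" + containerPort, image);
    }

    // Builds: docker stop <name>
    List<String> stopCommand() {
        return Arrays.asList("docker", "stop", name);
    }

    // Intended to be called from a static @BeforeClass method.
    void start() throws IOException, InterruptedException {
        run(startCommand());
    }

    // Intended to be called from a static @AfterClass method.
    void stop() throws IOException, InterruptedException {
        run(stopCommand());
    }

    // Runs the command, forwards its output, and fails on a non-zero exit.
    private static void run(List<String> command)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).inheritIO().start();
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IllegalStateException("Command failed with exit code "
                    + exitCode + ": " + String.join(" ", command));
        }
    }
}
```

If a test class calls start() from @BeforeClass and stop() from @AfterClass,
the IDE, the Maven cli and Travis would all go through the same code path,
provided Docker is installed wherever the tests run. A real version would
also need to wait until the service inside the container accepts
connections.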

It's probably worth looking at Kafka's testing infrastructure straight from
the source - https://github.com/apache/kafka/blob/trunk/tests/README.md.
They leverage Docker containers now for system tests.

Best,
Mike


On Fri, Mar 8, 2019 at 7:47 AM Ryan Merriman <[email protected]> wrote:

> I have been researching the effort involved to upgrade to HDP 3.  Along the
> way I've found a couple challenging issues that we will need to solve, both
> involving our integration testing strategy.
>
> The first issue is Kafka.  We are moving from 0.10.0 to 2.0.0 and there
> have been significant changes to the API.  This creates an issue in the
> KafkaComponent class, which we use as an in-memory Kafka server in
> integration tests.  Most of the classes that were previously used have gone
> away, and to the best of my knowledge, were not supported as public APIs.
> I also don't see any publicly documented APIs to replace them.
>
> The second issue is HBase.  We are moving from 1.1.2 to 2.0.2 so another
> significant change.  This creates an issue in the MockHTable class
> because the HTableInterface class has changed to Table, essentially
> requiring that MockHTable be rewritten to conform to the new interface.
> It's my opinion that this class is complicated and difficult to maintain as
> it is anyways.
>
> These 2 issues have the potential to add a significant amount of work to
> upgrading Metron to HDP 3.  I want to take a step back and review our
> options before we move forward.  Here are some initial thoughts I had on
> how to approach this.  For HBase:
>
>    1. Update MockHTable to work with the new HBase API.  We would continue
>    using a mock server approach for HBase.
>    2. Research replacing MockHTable with an in-memory HBase server.
>    3. Replace MockHTable with a Docker container running HBase.
>
> For Kafka:
>
>    1. Replace KafkaComponent with a mock server implementation.
>    2. Update KafkaComponent to work with the new API.  We would probably
>    need to leverage some internal Kafka classes.  I do not see a testing
>    API documented publicly.
>    3. Replace KafkaComponent with a Docker container running Kafka.
>
> What other options are there?  Whatever we choose I think we should follow
> a similar approach for both (mock servers, in memory servers, Docker, other
> options I'm not thinking of).
>
> This will not shock anyone but I would be in favor of Docker containers.
> They have the advantage of classpath isolation, easy upgrades, and accurate
> integration testing.  The downside is we will have to adjust our tests and
> Travis script to incorporate these Docker containers into our build
> process.  We have discussed this at length in the past and it has generally
> stalled for various reasons.  Maybe if we move a few services at a time it
> might be more palatable?  As for the other 2 approaches, I think if either
> worked well we wouldn't be having this discussion.  Mock servers are hard
> to maintain and I don't see in memory testing classes documented in
> javadocs for either service.
>
> Thoughts?
>
