I think I have mentioned it before, but Testcontainers (https://www.testcontainers.org) could be a viable way to implement option 3, the Docker container approach. I think it is worth looking at.
On March 8, 2019 at 09:47:54, Ryan Merriman (merrim...@gmail.com) wrote:

I have been researching the effort involved in upgrading to HDP 3. Along the way I've found a couple of challenging issues that we will need to solve, both involving our integration testing strategy.

The first issue is Kafka. We are moving from 0.10.0 to 2.0.0 and there have been significant changes to the API. This creates an issue in the KafkaComponent class, which we use as an in-memory Kafka server in integration tests. Most of the classes that were previously used have gone away and, to the best of my knowledge, were never supported as public APIs. I also don't see any publicly documented APIs to replace them.

The second issue is HBase. We are moving from 1.1.2 to 2.0.2, so this is another significant change. This creates an issue in the MockHTable class because HTableInterface has been replaced by Table, essentially requiring that MockHTable be rewritten to conform to the new interface. In my opinion this class is already complicated and difficult to maintain as it is.

These two issues have the potential to add a significant amount of work to upgrading Metron to HDP 3. I want to take a step back and review our options before we move forward. Here are some initial thoughts on how to approach this.

For HBase:
1. Update MockHTable to work with the new HBase API. We would continue using a mock server approach for HBase.
2. Research replacing MockHTable with an in-memory HBase server.
3. Replace MockHTable with a Docker container running HBase.

For Kafka:
1. Replace KafkaComponent with a mock server implementation.
2. Update KafkaComponent to work with the new API. We would probably need to leverage some internal Kafka classes. I do not see a testing API documented publicly.
3. Replace KafkaComponent with a Docker container running Kafka.

What other options are there?
Whatever we choose, I think we should follow a similar approach for both services (mock servers, in-memory servers, Docker, or other options I'm not thinking of). This will not shock anyone, but I would be in favor of Docker containers. They have the advantage of classpath isolation, easy upgrades, and accurate integration testing. The downside is that we will have to adjust our tests and Travis script to incorporate these Docker containers into our build process. We have discussed this at length in the past and it has generally stalled for various reasons. Maybe it would be more palatable if we moved a few services at a time?

As for the other two approaches, I think if either worked well we wouldn't be having this discussion. Mock servers are hard to maintain, and I don't see in-memory testing classes documented in the javadocs for either service. Thoughts?
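
To make the Docker option (3) a little more concrete, here is a rough sketch of what a container-backed test component might look like. It is only an illustration under assumptions: DockerServiceComponent is a hypothetical helper, not an existing Metron class, and the image name example/kafka:2.0.0 is a placeholder rather than a real published image. It uses only the JDK (Java 9+) and the standard docker CLI flags (--detach, --rm, --publish); a library such as Testcontainers would cover the same lifecycle with more features.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical test helper (not part of Metron): describes a service that an
 * integration test runs in a Docker container instead of in-process.
 */
final class DockerServiceComponent {
    private final String image; // placeholder image name supplied by the caller
    private final Map<Integer, Integer> ports = new LinkedHashMap<>(); // host port -> container port

    public DockerServiceComponent(String image) {
        this.image = image;
    }

    /** Publishes a container port on the given host port. */
    public DockerServiceComponent withPort(int hostPort, int containerPort) {
        ports.put(hostPort, containerPort);
        return this;
    }

    /** Builds the `docker run` argument list without executing anything. */
    public List<String> runCommand() {
        List<String> cmd = new ArrayList<>();
        cmd.add("docker");
        cmd.add("run");
        cmd.add("--detach"); // return immediately and print the container id
        cmd.add("--rm");     // remove the container once it is stopped
        for (Map.Entry<Integer, Integer> p : ports.entrySet()) {
            cmd.add("--publish");
            cmd.add(p.getKey() + ":" + p.getValue());
        }
        cmd.add(image);
        return Collections.unmodifiableList(cmd);
    }

    /** Starts the container and returns its id. Requires a local Docker daemon. */
    public String start() throws IOException, InterruptedException {
        Process process = new ProcessBuilder(runCommand())
                .redirectError(ProcessBuilder.Redirect.INHERIT) // keep pull progress out of stdout
                .start();
        String id = new String(process.getInputStream().readAllBytes(), StandardCharsets.UTF_8).trim();
        if (process.waitFor() != 0) {
            throw new IOException("docker run failed for image " + image);
        }
        return id;
    }

    /** Stops a container previously returned by start(). */
    public static void stop(String containerId) throws IOException, InterruptedException {
        new ProcessBuilder("docker", "stop", containerId).inheritIO().start().waitFor();
    }

    public static void main(String[] args) {
        // Only prints the command; nothing is started here.
        DockerServiceComponent kafka =
                new DockerServiceComponent("example/kafka:2.0.0").withPort(9092, 9092);
        System.out.println(String.join(" ", kafka.runCommand()));
        // prints: docker run --detach --rm --publish 9092:9092 example/kafka:2.0.0
    }
}
```

A test would call start() in its setup, point its Kafka or HBase client at the published port, and call stop() in teardown. The same component could wrap an HBase image, which fits the idea of following one approach for both services and moving a few services at a time.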