Mathieu,

Your composition of Per-module Unit Tests + ProcessorTopologyTestDriver + System Tests looks good to me. I also agree that, since this is part of your pre-commit process, which could be triggered concurrently by different developers / teams, EmbeddedSingleNodeKafkaCluster + EmbeddedZookeeper may not work best for you.
Guozhang

On Mon, Aug 15, 2016 at 1:39 PM, Radoslaw Gruchalski <ra...@gruchalski.com> wrote:

> Out of curiosity, are you aware of kafka.util.TestUtils and Apache Curator
> TestingServer?
> I’m using these successfully to test publish / consume scenarios with
> things like Flink, Spark and custom apps.
> What would stop you from taking the same approach?
>
> –
> Best regards,
> Radek Gruchalski
> ra...@gruchalski.com
>
>
> On August 15, 2016 at 9:41:37 PM, Mathieu Fenniak
> (mathieu.fenn...@replicon.com) wrote:
>
> Hi Michael,
>
> It would definitely be an option. I am not currently doing any testing
> like that; it could replace the ProcessorTopologyTestDriver-style testing
> that I'd like to do, but there are some trade-offs to consider:
>
> - I can't do an isolated test of just the TopologyBuilder; I'd be
>   bringing in configuration management code (e.g. configuring where to
>   access ZK + Kafka).
> - Tests using a running Kafka server wouldn't have a clear end-point; if
>   something in the topology doesn't publish a message where I expected it
>   to, my test can only fail via a timeout.
> - Tests are likely to be slower; this might not be significant, but a
>   small difference in test speed has a big impact on productivity after a
>   few months of development.
> - Tests will be more complex & fragile; some additional component needs
>   to manage starting up that Kafka server, making sure it's ready-to-go,
>   running tests, and then tearing it down.
> - Tests will have to be cautious of state existing in Kafka, e.g. two
>   test suites that touch the same topics could be influenced by state of a
>   previous test. Either you take a "destroy the world" approach between
>   test cases (or test suites), which probably makes test speed much worse,
>   or you find another way to isolate each test's state.
>
> I'd have to face all these problems at the higher level that I'm calling
> "systems-level tests", but I think it would be better to do the majority
> of the automated testing at a lower level that doesn't bring these
> considerations into play.
>
> Mathieu
>
>
> On Mon, Aug 15, 2016 at 12:13 PM, Michael Noll <mich...@confluent.io>
> wrote:
>
> > Mathieu,
> >
> > follow-up question: Are you also doing or considering integration
> > testing by spawning a local Kafka cluster and then reading/writing to
> > that cluster (often called embedded or in-memory cluster)? This approach
> > would be in the middle between ProcessorTopologyTestDriver (that does not
> > spawn a Kafka cluster) and your system-level testing (which I suppose is
> > running against a "real" test Kafka cluster).
> >
> > -Michael
> >
> >
> > On Mon, Aug 15, 2016 at 3:44 PM, Mathieu Fenniak
> > <mathieu.fenn...@replicon.com> wrote:
> >
> > > Hey all,
> > >
> > > At my workplace, we have a real focus on automated software testing.
> > > I'd love to be able to test the composition of a TopologyBuilder with
> > > org.apache.kafka.test.ProcessorTopologyTestDriver
> > > <https://github.com/apache/kafka/blob/14934157df7aaf5e9c37a302ef9fd9317b95efa4/streams/src/test/java/org/apache/kafka/test/ProcessorTopologyTestDriver.java>;
> > > has there ever been any thought given to making this part of the
> > > public API of Kafka Streams?
> > >
> > > For some background, here are some details on the automated testing
> > > plan that I have in mind for a Kafka Streams application. Our goal is
> > > to enable continuous deployment of any new development we do, so it has
> > > to be rigorously tested with complete automation.
> > >
> > > As part of our pre-commit testing, we'd first have these gateways; no
> > > code would reach our master branch without passing these tests:
> > >
> > > - At the finest level, unit tests covering individual pieces like a
> > >   Serde, ValueMapper, ValueJoiner, aggregate adder/subtractor, etc.
> > >   These pieces are very isolated, very easy to unit test.
> > > - At a higher level, I'd like to have component tests of the
> > >   composition of the TopologyBuilder; this is where
> > >   ProcessorTopologyTestDriver would be valuable. There'd be far fewer
> > >   of these tests than the lower-level tests. There are no external
> > >   dependencies to these tests, so they'd be very fast.
> > >
> > > Having passed that level of testing, we'd deploy the Kafka Streams
> > > application to an integration testing area where the rest of our
> > > application is kept up-to-date, and proceed with these integration
> > > tests:
> > >
> > > - Systems-level tests where we synthesize inputs to the Kafka topics,
> > >   wait for the Streams app to process the data, and then inspect the
> > >   output that it pushes into other Kafka topics. These tests will be
> > >   fewer in number than the above tests, but they serve to ensure that
> > >   the application is well-configured, executing, and handling inputs &
> > >   outputs as expected.
> > > - UI-level tests where we verify behaviors that are expected from the
> > >   system as a whole. As our application is a web app, we'd be using
> > >   Selenium to drive a web browser and verifying interactions and
> > >   outputs that are expected from the Streams application matching our
> > >   real-world use-cases. These tests are even fewer in number than the
> > >   above.
> > >
> > > This is an adaptation of the automated testing scaffold that we
> > > currently use for microservices; I'd love any input on the plan as a
> > > whole.
> > >
> > > Thanks,
> > >
> > > Mathieu
> > >
> >
>

--
-- Guozhang
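
For readers of the archive, two of the trade-offs Mathieu lists for tests against a running broker (a missing message can only show up as a timeout, and state left in shared topics can leak between tests) can be illustrated with a rough sketch. This is not code from the thread and it does not use any Kafka API: the class name, both helper methods, and the in-memory BlockingQueue standing in for an output topic are all assumptions made for illustration.

```java
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Illustrative only: every name here is hypothetical, and the queue is a
// stand-in for a consumer reading an output topic on a running broker.
public class BrokerTestTradeoffsSketch {

    // Give each test its own topic name so that records written by an
    // earlier test cannot influence a later one.
    public static String uniqueTopic(String baseName) {
        return baseName + "-" + UUID.randomUUID();
    }

    // Wait for one output record, giving up after timeoutMs milliseconds.
    // An empty result is the only signal that nothing was published.
    public static Optional<String> awaitRecord(BlockingQueue<String> output, long timeoutMs) {
        try {
            return Optional.ofNullable(output.poll(timeoutMs, TimeUnit.MILLISECONDS));
        } catch (InterruptedException e) {
            // Restore the interrupt flag and report "no record".
            Thread.currentThread().interrupt();
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        BlockingQueue<String> output = new LinkedBlockingQueue<>();

        // Case 1: a record was published, so the wait returns right away.
        output.add("hello");
        System.out.println("record present: " + awaitRecord(output, 500).isPresent());

        // Case 2: nothing was published, so the test has to sit out the
        // full timeout before it can report a failure.
        long start = System.nanoTime();
        boolean present = awaitRecord(output, 200).isPresent();
        long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        System.out.println("record present: " + present + " after ~" + elapsedMs + " ms");

        // Two calls for the same base name yield different topic names.
        System.out.println(uniqueTopic("orders"));
        System.out.println(uniqueTopic("orders"));
    }
}
```

In the second case the failing path costs the whole timeout, which is the test-speed concern raised in the thread; a driver that pushes records through the topology synchronously would not need such a wait.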