Thanks for the KIP. The idea to add InputTopic and OutputTopic abstractions is really neat!
Couple of minor comment: (1) It's a little confusing that you list all method (existing, proposed to deprecate, and new one) of `TopologyTestDriver` in the KIP. Maybe only list the ones you propose to deprecate and the new ones you want to add? (Or mark all existing methods clearly -- atm, I need to got back to the code to read the KIP and to extract what changes are proposed). (2) `TopologyTestDriver#createInputTopic`: might it be worth to add overload to initialize the timetamp and auto-advance feature directly? Otherwise, uses always need to call `configureTiming` as an extra call? (3) `TestInputTopic#configureTiming()`: maybe rename to `reconfigureTiming()` ? (4) Should we switch from `long` for timestamps to `Instant` and `Duration` ? (5) Why do we have redundant getters? Or set with `getX()` and one set without `get`-prefix? -Matthias On 6/21/19 10:57 AM, Bill Bejeck wrote: > Jukka, > > Thanks for the KIP. I like the changes overall. > One thing I wanted to confirm, and this may be me being paranoid, but will > the changes for input/output topic affect how the TopologyTestDriver works > with internal topics when there are sub-topologies created? > > On Fri, Jun 21, 2019 at 12:05 PM Guozhang Wang <wangg...@gmail.com> wrote: > >> 1) Got it, could you list this class along with all its functions in the >> proposed public APIs as well? >> >> 2) Ack, thanks! >> >> On Thu, Jun 20, 2019 at 11:27 PM Jukka Karvanen < >> jukka.karva...@jukinimi.com> >> wrote: >> >>> Hi Guozhang, >>> >>> 1) This TestRecord is new class in my proposal. So it is a simplified >>> version of ProducerRecord and ConsumerRecord containing only the fields >>> needed to test record content. >>> >>> 2) >>> public final <K, V> TestInputTopic<K, V> createInputTopic(final String >>> topicName, final Serde<K> keySerde, final Serde<V> valueSerde); >>> public final <K, V> TestOutputTopic<K, V> createOutputTopic(final String >>> topicName, final Serde<K> keySerde, final Serde<V> valueSerde); >>> The purpose is to create separate object for each input and output topic >>> you are using. The topic name is given to createInput/OutputTopic when >>> initialize topic object. >>> >>> For example: >>> >>> final TestInputTopic<Long, String> inputTopic1 = >>> testDriver.createInputTopic( INPUT_TOPIC, longSerde, stringSerde); >>> final TestInputTopic<Long, String> inputTopic2 = >>> testDriver.createInputTopic( INPUT_TOPIC_MAP, longSerde, stringSerde); >>> final TestOutputTopic<Long, String> outputTopic1 = >>> testDriver.createOutputTopic(OUTPUT_TOPIC, longSerde, stringSerde); >>> final TestOutputTopic<String, Long> outputTopic2 = >>> testDriver.createOutputTopic(OUTPUT_TOPIC_MAP, stringSerde, >>> longSerde); >>> inputTopic1.pipeInput(1L, "Hello"); >>> assertThat(outputTopic1.readKeyValue(), equalTo(new KeyValue<>(1L, >>> "Hello"))); >>> assertThat(outputTopic2.readKeyValue(), equalTo(new KeyValue<>("Hello", >>> 1L))); >>> inputTopic2.pipeInput(1L, "Hello"); >>> >>> >>> Jukka >>> >>> to 20. kesäk. 2019 klo 23.52 Guozhang Wang (wangg...@gmail.com) >> kirjoitti: >>> >>>> Hello Jukka, >>>> >>>> Thanks for writing the KIP, I have a couple of quick questions: >>>> >>>> 1) Is "TestRecord" an existing class that you propose to piggy-back on? >>>> Right now we have a scala TestRecord case class but I doubt that was >> your >>>> proposal, or are you proposing to add a new Java class? >>>> >>>> 2) Would the new API only allow a single input / output topic with >>>> `createInput/OutputTopic`? If not, when we call pipeInput how to >>> determine >>>> which topic this record should be pipe to? >>>> >>>> >>>> Guozhang >>>> >>>> On Mon, Jun 17, 2019 at 1:34 PM John Roesler <j...@confluent.io> >> wrote: >>>> >>>>> Woah, I wasn't aware of that Hamcrest test style. Awesome! >>>>> >>>>> Thanks for the updates. I look forward to hearing what others think. >>>>> >>>>> -John >>>>> >>>>> On Mon, Jun 17, 2019 at 4:12 AM Jukka Karvanen >>>>> <jukka.karva...@jukinimi.com> wrote: >>>>>> >>>>>> Wiki page updated: >>>>>> >>>>> >>>> >>> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-470%3A+TopologyTestDriver+test+input+and+output+usability+improvements >>>>>> >>>>>> >>>>>> ClientRecord removed and replaced with TestRecord in method calls. >>>>>> TestRecordFactory removed (time tracking functionality to be >> included >>>> to >>>>>> TestInputTopic) >>>>>> OutputVerifier deprecated >>>>>> TestRecord topic removed and getters added >>>>>> >>>>>> Getters in TestRecord enable writing test ignoring selected fields >>> with >>>>>> hamcrest like this: >>>>>> >>>>>> assertThat(outputTopic.readRecord(), allOf( >>>>>> hasProperty("key", equalTo(1L)), >>>>>> hasProperty("value", equalTo("Hello")), >>>>>> hasProperty("headers", equalTo(headers)))); >>>>>> >>>>>> >>>>>> Jukka >>>>>> >>>>>> la 15. kesäk. 2019 klo 1.10 John Roesler (j...@confluent.io) >>>> kirjoitti: >>>>>> >>>>>>> Sounds good. Thanks as always for considering my feedback! >>>>>>> -John >>>>>>> >>>>>>> On Fri, Jun 14, 2019 at 12:12 PM Jukka Karvanen >>>>>>> <jukka.karva...@jukinimi.com> wrote: >>>>>>>> >>>>>>>> Ok, I will modify KIP Public Interface in a wiki based on the >>>>> feedback. >>>>>>>> >>>>>>>> TestRecordFactory / ConsumerRecordFactory was used by >>>> TestInputTopic >>>>> with >>>>>>>> the version I had with KIP456, but maybe I can merge That >>>>> functionality >>>>>>> to >>>>>>>> InputTopic or TestRecordFactory can kept non public maybe >>> moving >>>>> it to >>>>>>>> internals package. >>>>>>>> >>>>>>>> I will make the proposal with a slim down interface. >>>>>>>> I don't want to go to so slim as you proposed with only >>> TestRecord >>>> or >>>>>>>> List<TestRecord>, because you then still end up doing helper >>>> methods >>>>> to >>>>>>>> construct List of TestRecord. >>>>>>>> The list of values is easier to write and clearer to read than >> if >>>> you >>>>>>> need >>>>>>>> to contruct list of TestRecords. >>>>>>>> >>>>>>>> For example: >>>>>>>> >>>>>>>> final List<String> inputValues = Arrays.asList( >>>>>>>> "Apache Kafka Streams Example", >>>>>>>> "Using Kafka Streams Test Utils", >>>>>>>> "Reading and Writing Kafka Topic" >>>>>>>> ); >>>>>>>> inputTopic.pipeValueList(inputValues); >>>>>>>> >>>>>>>> >>>>>>>> Let's check after the next iteration is it still worth reducing >>> the >>>>>>> methods. >>>>>>>> >>>>>>>> >>>>>>>> Jukka >>>>>>>> >>>>>>>> >>>>>>>> pe 14. kesäk. 2019 klo 18.27 John Roesler (j...@confluent.io) >>>>> kirjoitti: >>>>>>>> >>>>>>>>> Thanks, Jukka, >>>>>>>>> >>>>>>>>> Ok, I buy this reasoning. >>>>>>>>> >>>>>>>>> Just to echo what I think I read, you would just drop >>>> ClientRecord >>>>>>>>> from the proposal, and TestRecord would stand on its own, >> with >>>> the >>>>>>>>> same methods and properties you proposed, and the "input >> topic" >>>>> would >>>>>>>>> accept TestRecord, and the "output topic" would produce >>>> TestRecord? >>>>>>>>> Further, the "input and output topic" classes would >> internally >>>>> handle >>>>>>>>> the conversion to and from ConsumerRecord and ProducerRecord >> to >>>>> pass >>>>>>>>> to and from the TopologyTestDriver? >>>>>>>>> >>>>>>>>> This seems good to me. >>>>>>>>> >>>>>>>>> Since the object coming out of the "output topic" is much >> more >>>>>>>>> ergonomic, I suspect we won't need the OutputVerifier at all. >>> It >>>>> was >>>>>>>>> mostly needed because of all the extra junk in ProducerRecord >>> you >>>>> need >>>>>>>>> to ignore. It seems better to just deprecate it. If in the >>> future >>>>> it >>>>>>>>> turns out there is some actual use case for a verifier, we >> can >>>>> have a >>>>>>>>> very small KIP to add one. But reading your response, I >> suspect >>>>> that >>>>>>>>> existing test verification libraries would be sufficient on >>> their >>>>> own. >>>>>>>>> >>>>>>>>> Similarly, it seems like we can shrink the total interface by >>>>> removing >>>>>>>>> the TestRecordFactory from the proposal. If TestRecord >> already >>>>> offers >>>>>>>>> all the constructors you'd want, then the only benefit of the >>>>> factory >>>>>>>>> is to auto-increment the timestamps, but then again, the >> "input >>>>> topic" >>>>>>>>> can already do that (e.g., it can do it if the record >> timestamp >>>> is >>>>> not >>>>>>>>> set). >>>>>>>>> >>>>>>>>> Likewise, if the TestRecords are easy to create, then we >> don't >>>> need >>>>>>>>> all the redundant methods in "input topic" to pipe values, or >>>>>>>>> key/values, or key/value/timestamp, etc. We can do with just >>> two >>>>>>>>> methods, one for a single TestRecord and one for a collection >>> of >>>>> them. >>>>>>>>> This reduces API ambiguity and also dramatically decreases >> the >>>>> surface >>>>>>>>> area of the interface, which ultimately makes it much easier >> to >>>>> use. >>>>>>>>> >>>>>>>>> It's best to start with the smallest interface that will do >> the >>>> job >>>>>>>>> and expand it upon request, rather than throwing in >> everything >>>> you >>>>> can >>>>>>>>> think of up front. The extra stuff would be confusing to >> people >>>>> facing >>>>>>>>> two practically identical paths to accomplish the same goal, >>> and >>>>> it's >>>>>>>>> very difficult to slim the interface down later, because we >>> don't >>>>>>>>> really know which parts are more popular (i.e., no one >> submits >>>>>>>>> "feature requests" to _remove_ stuff they don't need, only to >>>> _add_ >>>>>>>>> stuff that they need. >>>>>>>>> >>>>>>>>> But overall, I really like the structure of this design. I'm >>>> super >>>>>>>>> excited about this KIP. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> -John >>>>>>>>> >>>>>>>>> On Fri, Jun 14, 2019 at 2:55 AM Jukka Karvanen >>>>>>>>> <jukka.karva...@jukinimi.com> wrote: >>>>>>>>>> >>>>>>>>>> Hi, >>>>>>>>>> >>>>>>>>>> I am not a fan of swapping only ProducerRecord and >>>>> ConsumerRecord. >>>>>>>>>> As a test writer point of view I do not want to care about >>> the >>>>>>> difference >>>>>>>>>> of those and >>>>>>>>>> that way I like to have object type which can be used to >> pipe >>>>>>> records in >>>>>>>>>> and compare outputs. >>>>>>>>>> That way avoid unnecessary conversions between >> ProducerRecord >>>> and >>>>>>>>>> ConsumerRecord. >>>>>>>>>> >>>>>>>>>> My initial assumption was that ProducerRecord and >>>>>>> ConsumerRecord.could >>>>>>>>>> implement the same ClientRecord >>>>>>>>>> and that way test writer could have used either of those, >> but >>>>> seems >>>>>>> that >>>>>>>>>> return type of timestamp method long vs Long is not >>> compatible. >>>>>>>>>> Now the main advantage of ClientRecord is no need to >>> duplicate >>>>>>>>>> OutputVerifier when it is modified from ProducerRecord to >>>>>>> ClientRecord. >>>>>>>>>> Generally is there need for OutputVerifier. Can we >> deprecate >>>> the >>>>>>> existing >>>>>>>>>> and use standard assertion libraries for new test. >>>>>>>>>> >>>>>>>>>> If you use hamcrest, assert-j or any other assertion >> library >>>>> for the >>>>>>>>> rest >>>>>>>>>> of the test, why not use it with these also. >>>>>>>>>> When we have these methods to access only needed fields it >> is >>>>> easier >>>>>>> to >>>>>>>>>> write test like this: >>>>>>>>>> assertThat(outputTopic.readValue()).isEqualTo("Hello"); >>>>>>>>>> >>>>>>>>>> or >>>>>>>>>> >>> assertThat(outputTopic.readRecord()).isEqualTo(expectedRecord); >>>>>>>>>> >>>>>>>>>> Only value for new OutputVerifier is when needing to ignore >>>> some >>>>>>> fields >>>>>>>>>> ClientRecord actual = outputTopic.readRecord(); >>>>>>>>>> assertThat(actual.value()).isEqualTo("Hello"); >>>>>>>>>> assertThat(actual.key()).isEqualTo(expectedKey); >>>>>>>>>> >> assertThat(actual.timestamp()).isEqualTo(expectedTimestamp); >>>>>>>>>> >>>>>>>>>> So if want to leave client package untouched, I would >> modify >>>> the >>>>>>> methods >>>>>>>>>> with ClientRecord now in InputTopic and OutputTopic to pass >>> in >>>>> and >>>>>>> out >>>>>>>>> this >>>>>>>>>> TestRecord instead. >>>>>>>>>> In that case there would be possibility to add methods to >>>>> TestRecord >>>>>>> to >>>>>>>>>> help ignore some fields in assertions like: >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>> >>>>> >>>> >>> >> assertThat(outputTopic.readRecord().getValueTimestamp()).isEqualTo(expectedRecord.get >>>>>>>>>> ValueTimestamp()); >>>>>>>>>> >>>>>>>>>> How about this alternative? >>>>>>>>>> If this way sounds better I will modify KIP page in wiki. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Jukka >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> to 13. kesäk. 2019 klo 18.30 John Roesler ( >> j...@confluent.io >>> ) >>>>>>> kirjoitti: >>>>>>>>>> >>>>>>>>>>> Hey, all, maybe we can jump-start this discussion. >>>>>>>>>>> >>>>>>>>>>> I think this approach would be very ergonomic for >> testing, >>>> and >>>>>>> would >>>>>>>>>>> help reduce boilerplate in tests. >>>>>>>>>>> >>>>>>>>>>> The think I like most about it is that it mirrors the >>> mental >>>>> model >>>>>>>>>>> that people already have from Kafka Streams, in which you >>>>> write to >>>>>>> an >>>>>>>>>>> "input topic" and then get your results from an "output >>>>> topic". As >>>>>>> a >>>>>>>>>>> side benefit, the input and output topics in the proposal >>>> also >>>>>>> close >>>>>>>>>>> over the serdes, which makes it much less boilerplate for >>>> test >>>>>>> code. >>>>>>>>>>> >>>>>>>>>>> If I can offer one suggestion for change, I'm not sure >> I'm >>>>> totally >>>>>>>>>>> sold on the need for a new abstraction "ClientRecord" >> with >>> an >>>>>>>>>>> implementation for tests "TestRecord". It seems like this >>>>>>>>>>> unnecessarily complicates the main (non-testing) data >>> model. >>>> It >>>>>>> seems >>>>>>>>>>> like it would be sufficient, and just as ergonomic, to >> have >>>> the >>>>>>> input >>>>>>>>>>> topic accept ProducerRecords and the output topic return >>>>>>>>>>> ConsumerRecords. I'm open to discussion on this point, >>>> though. >>>>>>>>>>> >>>>>>>>>>> Thanks for the proposal, Jukka! >>>>>>>>>>> -John >>>>>>>>>>> >>>>>>>>>>> On Mon, May 20, 2019 at 7:59 AM Jukka Karvanen >>>>>>>>>>> <jukka.karva...@jukinimi.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>> Hi All, >>>>>>>>>>>> >>>>>>>>>>>> I would like to start the discussion on KIP-470: >>>>>>> TopologyTestDriver >>>>>>>>> test >>>>>>>>>>>> input and output usability improvements: >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>> >>>>>>> >>>>> >>>> >>> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-470%3A+TopologyTestDriver+test+input+and+output+usability+improvements >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> This KIP is inspired by the Discussion in KIP-456: >> Helper >>>>>>> classes to >>>>>>>>> make >>>>>>>>>>>> it simpler to write test logic with TopologyTestDriver: >>>>>>>>>>>> >>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-456 >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>> >>>>>>> >>>>> >>>> >>> >> %3A+Helper+classes+to+make+it+simpler+to+write+test+logic+with+TopologyTestDriver >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> The proposal in KIP-456 >>>>>>>>>>>> < >>>>>>>>>>> >>>>>>>>> >>>>>>> >>>>> >>>> >>> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-456%3A+Helper+classes+to+make+it+simpler+to+write+test+logic+with+TopologyTestDriver >>>>>>>>>>>> >>>>>>>>>>>> was >>>>>>>>>>>> to add alternate way to input and output topic, but >> this >>>> KIP >>>>>>> enhance >>>>>>>>>>> those >>>>>>>>>>>> classes and deprecate old functionality to make clear >>>>> interface >>>>>>> for >>>>>>>>> test >>>>>>>>>>>> writer to use. >>>>>>>>>>>> >>>>>>>>>>>> In current KIP-470 proposal, topic objects are created >>> with >>>>>>>>> topicName and >>>>>>>>>>>> related serders. >>>>>>>>>>>> public final <K, V> TestInputTopic<K, V> >>>>>>> createInputTopic(final >>>>>>>>>>> String >>>>>>>>>>>> topicName, final Serde<K> keySerde, final Serde<V> >>>>> valueSerde); >>>>>>>>>>>> public final <K, V> TestOutputTopic<K, V> >>>>>>> createOutputTopic(final >>>>>>>>>>> String >>>>>>>>>>>> topicName, final Serde<K> keySerde, final Serde<V> >>>>> valueSerde); >>>>>>>>>>>> One thing I wondered if there way to find out topic >>> related >>>>> serde >>>>>>>>> from >>>>>>>>>>>> TopologyTestDriver topology, it would simply creation >> of >>>>> these >>>>>>> Topic >>>>>>>>>>>> objects: >>>>>>>>>>>> public final <K, V> TestInputTopic<K, V> >>>>>>> createInputTopic(final >>>>>>>>>>> String >>>>>>>>>>>> topicName); >>>>>>>>>>>> public final <K, V> TestOutputTopic<K, V> >>>>>>> createOutputTopic(final >>>>>>>>>>> String >>>>>>>>>>>> topicName); >>>>>>>>>>>> >>>>>>>>>>>> KIP-456 can be discarded if this KIP get accepted. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Best Regards, >>>>>>>>>>>> Jukka >>>>>>>>>>> >>>>>>>>> >>>>>>> >>>>> >>>> >>>> >>>> -- >>>> -- Guozhang >>>> >>> >> >> >> -- >> -- Guozhang >> >
signature.asc
Description: OpenPGP digital signature