Hi, according to the feedback, I updated the KIP, and limited its scope to some extend:
- Instead of changing the creation of KafkaStreams instances, we keep the current pattern (we might do a follow up KIP on this though). - We also added a new method #describe() that returns a TopologyDescription that can be used for any "pretty print" functionality or similar. - We also removed method #topologyBuilder() from KStreamBuilder because we think #transform() should provide all functionality you need to mix-an-match Processor API and DSL. If there is any further concern about this, please let us know. -Matthias On 2/14/17 9:59 AM, Matthias J. Sax wrote: > You can already output any number of record within .transform() using > the provided Context object from init()... > > > -Matthias > > On 2/14/17 9:16 AM, Guozhang Wang wrote: >>> and you can't output multiple records or branching logic from a >> transform(); >> >> For output multiple records in transform, we are currently working on >> https://issues.apache.org/jira/browse/KAFKA-4217, I think that should cover >> this use case. >> >> For branching the output in transform, I agree this is not perfect but I >> think users can follow some patterns like "stream.transform().branch()", >> would that work for you? >> >> >> Guozhang >> >> >> On Tue, Feb 14, 2017 at 8:29 AM, Mathieu Fenniak < >> mathieu.fenn...@replicon.com> wrote: >> >>> On Tue, Feb 14, 2017 at 1:14 AM, Guozhang Wang <wangg...@gmail.com> wrote: >>> >>>> Some thoughts on the mixture usage of DSL / PAPI: >>>> >>>> There were some suggestions on mixing the usage of DSL and PAPI: >>>> https://issues.apache.org/jira/browse/KAFKA-3455, and after thinking it >>> a >>>> bit more carefully, I'd rather not recommend users following this >>> pattern, >>>> since in DSL this can always be achieved in process() / transform(). >>> Hence >>>> I think it is okay to prevent such patterns in the new APIs. And for the >>>> same reasons, I think we can remove KStreamBuilder#newName() from the >>>> public APIs. >>>> >>> >>> I'm not sure that things can always be achieved by process() / >>> transform()... there are some limitations to these APIs. You can't output >>> from a process(), and you can't output multiple records or branching logic >>> from a transform(); these are things that can be done in the PAPI quite >>> easily. >>> >>> I definitely understand a preference for using process()/transform() where >>> possible, but, they don't seem to replace the PAPI. >>> >>> I would love to be operating in a world that was entirely DSL. But the DSL >>> is limited, and it isn't extensible (... by any stable API). I don't mind >>> reaching into internals today and making my own life difficult to extend >>> it, and I'd continue to find a way to do that if you made the APIs distinct >>> and split, but I'm just expressing my preference that you not do that. :-) >>> >>> And about printing the topology for debuggability: I agrees this is a >>>> potential drawback, and I'd suggest maintain some functionality to build >>> a >>>> "dry topology" as Mathieu suggested; the difficulty is that, internally >>> we >>>> need a different "copy" of the topology for each thread so that they will >>>> not share any states, so we cannot directly pass in the topology into >>>> KafkaStreams instead of the topology builder. So how about adding a >>>> `ToplogyBuilder#toString` function which calls `build()` internally then >>>> prints the built dry topology? >>>> >>> >>> Well, this sounds better than KafkaStreams#toString() in that it doesn't >>> require a running processor. But I'd really love to have a simple object >>> model for the topology, not a string output, so that I can output my own >>> debug format. I currently have that in the form of >>> TopologyBuilder#nodeGroups() & TopologyBuilder#build(Integer). >>> >>> Mathieu >>> >> >> >> >
signature.asc
Description: OpenPGP digital signature