Re: [DISCUSS] KIP-120: Cleanup Kafka Streams builder API

Matthias J. Sax Tue, 14 Feb 2017 10:00:16 -0800

You can already output any number of record within .transform() using
the provided Context object from init()...



-Matthias

On 2/14/17 9:16 AM, Guozhang Wang wrote:
>> and you can't output multiple records or branching logic from a
> transform();
> 
> For output multiple records in transform, we are currently working on
> https://issues.apache.org/jira/browse/KAFKA-4217, I think that should cover
> this use case.
> 
> For branching the output in transform, I agree this is not perfect but I
> think users can follow some patterns like "stream.transform().branch()",
> would that work for you?
> 
> 
> Guozhang
> 
> 
> On Tue, Feb 14, 2017 at 8:29 AM, Mathieu Fenniak <
> mathieu.fenn...@replicon.com> wrote:
> 
>> On Tue, Feb 14, 2017 at 1:14 AM, Guozhang Wang <wangg...@gmail.com> wrote:
>>
>>> Some thoughts on the mixture usage of DSL / PAPI:
>>>
>>> There were some suggestions on mixing the usage of DSL and PAPI:
>>> https://issues.apache.org/jira/browse/KAFKA-3455, and after thinking it
>> a
>>> bit more carefully, I'd rather not recommend users following this
>> pattern,
>>> since in DSL this can always be achieved in process() / transform().
>> Hence
>>> I think it is okay to prevent such patterns in the new APIs. And for the
>>> same reasons, I think we can remove KStreamBuilder#newName() from the
>>> public APIs.
>>>
>>
>> I'm not sure that things can always be achieved by process() /
>> transform()... there are some limitations to these APIs.  You can't output
>> from a process(), and you can't output multiple records or branching logic
>> from a transform(); these are things that can be done in the PAPI quite
>> easily.
>>
>> I definitely understand a preference for using process()/transform() where
>> possible, but, they don't seem to replace the PAPI.
>>
>> I would love to be operating in a world that was entirely DSL.  But the DSL
>> is limited, and it isn't extensible (... by any stable API).  I don't mind
>> reaching into internals today and making my own life difficult to extend
>> it, and I'd continue to find a way to do that if you made the APIs distinct
>> and split, but I'm just expressing my preference that you not do that. :-)
>>
>> And about printing the topology for debuggability: I agrees this is a
>>> potential drawback, and I'd suggest maintain some functionality to build
>> a
>>> "dry topology" as Mathieu suggested; the difficulty is that, internally
>> we
>>> need a different "copy" of the topology for each thread so that they will
>>> not share any states, so we cannot directly pass in the topology into
>>> KafkaStreams instead of the topology builder. So how about adding a
>>> `ToplogyBuilder#toString` function which calls `build()` internally then
>>> prints the built dry topology?
>>>
>>
>> Well, this sounds better than KafkaStreams#toString() in that it doesn't
>> require a running processor.  But I'd really love to have a simple object
>> model for the topology, not a string output, so that I can output my own
>> debug format.  I currently have that in the form of
>> TopologyBuilder#nodeGroups() & TopologyBuilder#build(Integer).
>>
>> Mathieu
>>
> 
> 
>

signature.asc
Description: OpenPGP digital signature

Re: [DISCUSS] KIP-120: Cleanup Kafka Streams builder API

Reply via email to