Matthias, 

Thanks for the feedback. For our use case, we have some complexities that make 
using the existing Streams API more complicated than using the Kafka Consumer 
directly. 

- We are doing async processing, which I don't think is currently available 
in the Streams API (KIP-311 is addressing this). 

- Our state has a high eviction rate, so Kafka compacted topics are not ideal 
for storing the changelog. Compaction cannot keep up, and the topic ends up 
being mostly tombstones by the time it is read on partition reassignment. We 
are using a KV store for the "change log" instead.

- We wanted to separate consumer threads from worker threads to maximize 
parallelization while keeping the number of consumer TCP connections down 
(see the rough sketch below).
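
For illustration only, here is a rough sketch of that consumer/worker split in 
plain Java. The topic, group, config values, WORKER_COUNT, and process() method 
are all made up for this example, and offset commits and backpressure are 
intentionally omitted:

// Rough sketch: a single consumer thread feeding a fixed pool of worker
// threads. All names here (topic, group, WORKER_COUNT, process) are
// hypothetical; offset commits and backpressure are intentionally omitted.
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ConsumerWorkerSketch {

    private static final int WORKER_COUNT = 8;

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "stream-join-sketch");
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("enable.auto.commit", "false");

        // One consumer (one set of broker connections) regardless of how many
        // worker threads do the actual processing.
        ExecutorService workers = Executors.newFixedThreadPool(WORKER_COUNT);

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("input-topic"));
            while (true) {
                ConsumerRecords<String, String> records =
                        consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    // Hand the record off so the poll loop stays responsive.
                    workers.submit(() -> process(record));
                }
                // A real implementation would track per-partition completion
                // before committing offsets, and pause()/resume() partitions
                // to apply backpressure.
            }
        }
    }

    private static void process(ConsumerRecord<String, String> record) {
        // Application-specific join/processing logic goes here.
    }
}

With this shape, processing parallelism is set by the worker pool size rather 
than by the number of consumer instances.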

Ultimately, it was much simpler to use the KafkaConsumer directly than to 
peel away a lot of what the Streams API does for you. I think we should 
continue to add support for more complex use cases and processing to the 
Streams API. However, I think there will remain streaming join use cases that 
benefit from the flexibility of using the KafkaConsumer directly. 

Mike

On 6/20/18, 5:08 PM, "Matthias J. Sax" <matth...@confluent.io> wrote:

    Mike,
    
    thanks a lot for the KIP. I am wondering why the Streams API cannot be
    used to perform the join. It would be good to understand the advantage of
    adding a `StickyStreamJoinAssignor` compared to using the Streams API; at
    the moment, it seems to be a redundant feature to me.
    
    -Matthias
    
    
    On 6/20/18 1:07 PM, Mike Freyberger wrote:
    > Hi everybody,
    > 
    > I’ve created a proposal document for KIP-315, which outlines the
    > motivation for adding a new partition assignment strategy that can be
    > used for streaming join use cases.
    > 
    > It’d be great to get feedback on the overall idea and the proposed
    > implementation.
    > 
    > KIP Link:
    > https://cwiki.apache.org/confluence/display/KAFKA/KIP-315%3A+Stream+Join+Sticky+Assignor
    > 
    > Thanks,
    > 
    > Mike
    > 
    
    
