[ https://issues.apache.org/jira/browse/KAFKA-6049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16758710#comment-16758710 ]
Matthias J. Sax commented on KAFKA-6049: ---------------------------------------- Feature freeze deadline for 2.2 was yesterday. Moving this 2.3 release. > Kafka Streams: Add Cogroup in the DSL > ------------------------------------- > > Key: KAFKA-6049 > URL: https://issues.apache.org/jira/browse/KAFKA-6049 > Project: Kafka > Issue Type: New Feature > Components: streams > Reporter: Guozhang Wang > Priority: Major > Labels: api, kip, user-experience > > When multiple streams aggregate together to form a single larger object (e.g. > a shopping website may have a cart stream, a wish list stream, and a > purchases stream. Together they make up a Customer), it is very difficult to > accommodate this in the Kafka-Streams DSL: it generally requires you to group > and aggregate all of the streams to KTables then make multiple outer join > calls to end up with a KTable with your desired object. This will create a > state store for each stream and a long chain of ValueJoiners that each new > record must go through to get to the final object. > Creating a cogroup method where you use a single state store will: > * Reduce the number of gets from state stores. With the multiple joins when a > new value comes into any of the streams a chain reaction happens where the > join processor keep calling ValueGetters until we have accessed all state > stores. > * Slight performance increase. As described above all ValueGetters are called > also causing all ValueJoiners to be called forcing a recalculation of the > current joined value of all other streams, impacting performance. -- This message was sent by Atlassian JIRA (v7.6.3#76005)