[ 
https://issues.apache.org/jira/browse/KAFKA-3705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-3705:
-----------------------------------
    Description: 
KIP-213: 
[https://cwiki.apache.org/confluence/display/KAFKA/KIP-213+Support+non-key+joining+in+KTable]

Today in the Kafka Streams DSL, KTable joins are based only on keys. If users 
want to join a KTable A keyed by {{a}} with another KTable B keyed by {{b}} 
that carries {{a}} as a "foreign key", and assuming the tables are read from 
two topics partitioned on {{a}} and {{b}} respectively, they need to use the 
following pattern:
{code:java}
// tableB' is now partitioned on "a"
tableB' = tableB.groupBy(/* select on field "a" */).agg(...);

tableA.join(tableB', joiner);
{code}
Even if these two tables are read from two topics which are already partitioned 
on {{a}}, users still need to do the pre-aggregation in order to put the two 
joining streams on the same key. This is a drawback for programmability 
and we should fix it.
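
As a rough illustration of the pattern above, the two steps (re-key and aggregate table B on {{a}}, then do an ordinary key-based join) can be mimicked with plain in-memory maps. This is only a sketch and does not use the Kafka Streams API: the class {{ForeignKeyJoinSketch}}, the value type {{BValue}}, the sample data, and the choice of summing as the aggregation are all assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ForeignKeyJoinSketch {

    /** Hypothetical value type for table B: each row carries the foreign key "a". */
    static final class BValue {
        final String a;
        final int amount;

        BValue(String a, int amount) {
            this.a = a;
            this.amount = amount;
        }
    }

    /** Step 1: re-key table B on its field "a" and aggregate (here: sum) per "a". */
    static Map<String, Integer> groupByForeignKey(Map<String, BValue> tableB) {
        Map<String, Integer> byA = new LinkedHashMap<>();
        for (BValue v : tableB.values()) {
            byA.merge(v.a, v.amount, Integer::sum);
        }
        return byA;
    }

    /** Step 2: ordinary key-based inner join, possible only because both sides are keyed on "a". */
    static Map<String, String> joinByKey(Map<String, String> tableA,
                                         Map<String, Integer> tableBByA) {
        Map<String, String> joined = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : tableA.entrySet()) {
            Integer total = tableBByA.get(e.getKey());
            if (total != null) {
                joined.put(e.getKey(), e.getValue() + ":" + total);
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        Map<String, String> tableA = new LinkedHashMap<>(); // keyed by "a"
        tableA.put("a1", "alice");
        tableA.put("a2", "bob");

        Map<String, BValue> tableB = new LinkedHashMap<>(); // keyed by "b"
        tableB.put("b1", new BValue("a1", 10));
        tableB.put("b2", new BValue("a1", 5));
        tableB.put("b3", new BValue("a3", 7));

        // Only "a1" appears on both sides after re-keying.
        System.out.println(joinByKey(tableA, groupByForeignKey(tableB)));
    }
}
```

The sketch makes the cost visible: table B has to be collapsed to one aggregated value per {{a}} before the join can run, so the individual B rows are no longer available to the joiner.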

  was:
Today in Kafka Streams DSL, KTable joins are only based on keys. If users want 
to join a KTable A by key {{a}} with another KTable B by key {{b}} but with a 
"foreign key" {{a}}, and assuming they are read from two topics which are 
partitioned on {{a}} and {{b}} respectively, they need to do the following 
pattern:

{code}
// tableB' is now partitioned on "a"
tableB' = tableB.groupBy(/* select on field "a" */).agg(...);

tableA.join(tableB', joiner);
{code}

Even if these two tables are read from two topics which are already partitioned 
on {{a}}, users still need to do the pre-aggregation in order to make the two 
joining streams to be on the same key. This is a draw-back from programability 
and we should fix it.


> Support non-key joining in KTable
> ---------------------------------
>
>                 Key: KAFKA-3705
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3705
>             Project: Kafka
>          Issue Type: New Feature
>          Components: streams
>            Reporter: Guozhang Wang
>            Priority: Major
>              Labels: api, kip
>
> KIP-213: 
> [https://cwiki.apache.org/confluence/display/KAFKA/KIP-213+Support+non-key+joining+in+KTable]
> Today in the Kafka Streams DSL, KTable joins are based only on keys. If users 
> want to join a KTable A keyed by {{a}} with another KTable B keyed by {{b}} 
> that carries {{a}} as a "foreign key", and assuming the tables are read from 
> two topics partitioned on {{a}} and {{b}} respectively, they need to use the 
> following pattern:
> {code:java}
> // tableB' is now partitioned on "a"
> tableB' = tableB.groupBy(/* select on field "a" */).agg(...);
> tableA.join(tableB', joiner);
> {code}
> Even if these two tables are read from two topics which are already 
> partitioned on {{a}}, users still need to do the pre-aggregation in order to 
> put the two joining streams on the same key. This is a drawback for 
> programmability and we should fix it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
