[ https://issues.apache.org/jira/browse/KAFKA-3337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15244674#comment-15244674 ]
ASF GitHub Bot commented on KAFKA-3337:
---------------------------------------

GitHub user mjsax opened a pull request:

    https://github.com/apache/kafka/pull/1231

    KAFKA-3337: [WIP] Extract selector as a separate groupBy operator for KTable aggregations

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mjsax/kafka kafka-3337-extact-key-selector-from-agg

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/1231.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1231

----
commit 8b6e3f6b9097ae78e1737bf1fadd3647d8a20a5d
Author: Matthias J. Sax <matth...@confluent.io>
Date:   2016-04-17T12:24:29Z

    KAFKA-3337: [WIP] Extract selector as a separate groupBy operator for KTable aggregations

----

> Extract selector as a separate groupBy operator for KTable aggregations
> -----------------------------------------------------------------------
>
>                 Key: KAFKA-3337
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3337
>             Project: Kafka
>          Issue Type: Sub-task
>          Components: streams
>            Reporter: Guozhang Wang
>            Assignee: Matthias J. Sax
>              Labels: api, newbie++
>             Fix For: 0.10.0.0
>
>
> Currently KTable aggregation takes a selector for choosing the aggregate
> key and an aggregator for combining the values that share the selected
> key, which makes the function a little bit "heavy":
> {code}
> table.groupBy(initializer, adder, subtractor, selector, /* optional serde */);
> {code}
> It is better to extract the selector into a separate groupBy function such that
> {code}
> KTableGrouped KTable#groupBy(selector);
> KTable KTableGrouped#aggregate(initializer, adder, subtractor, /* optional serde */);
> {code}
> Note that "KTableGrouped" only has APIs for aggregate and reduce, and
> nothing else. So users have to follow the pattern below:
> {code}
> table.groupBy(...).aggregate(...);
> {code}
> This pattern is more natural for users who are familiar with SQL, Pig,
> the Spark DSL, etc.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
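
The two-step shape proposed in the description can be sketched in plain Java. This is only an illustration and does not use Kafka Streams: ToyTable, ToyGrouped, ToyAggregate, and demo() are hypothetical names invented here, and the retract-then-add handling of updates is an assumption about why both an adder and a subtractor are passed.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.Function;
import java.util.function.Supplier;

public class GroupBySketch {

    /** Hypothetical stand-in for KTable: an in-memory key/value snapshot. */
    static final class ToyTable<K, V> {
        private final Map<K, V> rows = new HashMap<>();

        ToyTable<K, V> put(K key, V value) {
            rows.put(key, value);
            return this;
        }

        /** Step 1: choose the aggregate key; no aggregation logic lives here. */
        <G> ToyGrouped<G, V> groupBy(Function<V, G> selector) {
            return new ToyGrouped<>(rows.values(), selector);
        }
    }

    /** Hypothetical stand-in for KTableGrouped: it only offers aggregate. */
    static final class ToyGrouped<G, V> {
        private final Collection<V> values;
        private final Function<V, G> selector;

        ToyGrouped(Collection<V> values, Function<V, G> selector) {
            this.values = values;
            this.selector = selector;
        }

        /** Step 2: fold all values that share a selected key. */
        <A> ToyAggregate<G, V, A> aggregate(Supplier<A> initializer,
                                            BiFunction<A, V, A> adder,
                                            BiFunction<A, V, A> subtractor) {
            ToyAggregate<G, V, A> agg =
                new ToyAggregate<>(selector, initializer, adder, subtractor);
            for (V value : values) {
                agg.update(null, value); // initial load: nothing to retract
            }
            return agg;
        }
    }

    /** Aggregation result that can absorb later row updates. */
    static final class ToyAggregate<G, V, A> {
        final Map<G, A> state = new HashMap<>();
        private final Function<V, G> selector;
        private final Supplier<A> initializer;
        private final BiFunction<A, V, A> adder;
        private final BiFunction<A, V, A> subtractor;

        ToyAggregate(Function<V, G> selector, Supplier<A> initializer,
                     BiFunction<A, V, A> adder, BiFunction<A, V, A> subtractor) {
            this.selector = selector;
            this.initializer = initializer;
            this.adder = adder;
            this.subtractor = subtractor;
        }

        /** Retract the old value from its group, then add the new value to its group. */
        void update(V oldValue, V newValue) {
            if (oldValue != null) {
                state.computeIfPresent(selector.apply(oldValue),
                    (group, agg) -> subtractor.apply(agg, oldValue));
            }
            if (newValue != null) {
                G group = selector.apply(newValue);
                A current = state.containsKey(group) ? state.get(group) : initializer.get();
                state.put(group, adder.apply(current, newValue));
            }
        }
    }

    /** Counts users per region, then moves one user from "eu" to "us". */
    static Map<String, Long> demo() {
        ToyTable<String, String> userRegions = new ToyTable<String, String>()
            .put("alice", "eu")
            .put("bob", "us")
            .put("carol", "eu");

        ToyAggregate<String, String, Long> counts = userRegions
            .groupBy(region -> region)
            .aggregate(() -> 0L,
                       (count, region) -> count + 1,
                       (count, region) -> count - 1);

        counts.update("eu", "us"); // alice's row changes from "eu" to "us"
        return counts.state;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this toy, the selector appears only in groupBy, and aggregate receives just the initializer, adder, and subtractor, which mirrors the split the issue asks for. After the update in demo(), the count for "eu" is 1 and the count for "us" is 2.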