[ https://issues.apache.org/jira/browse/FLINK-100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153365#comment-14153365 ]
Stephan Ewen commented on FLINK-100:
------------------------------------
This is along the lines of a Co-Partition-Map.
> Pact API Proposal: Add keyless CoGroup (send all to a single group)
> -------------------------------------------------------------------
>
> Key: FLINK-100
> URL: https://issues.apache.org/jira/browse/FLINK-100
> Project: Flink
> Issue Type: Improvement
> Reporter: GitHub Import
> Labels: github-import
> Fix For: pre-apache
>
>
> I propose to add a keyless version of CoGroup that groups both inputs in a
> single group, analogous to the keyless Reducer version that was added in
> https://github.com/dimalabs/ozone/pull/61
> ```
> CoGroupContract myCoGroup = CoGroupContract.builder(MyUdf.class)
>     .input1(contractA)
>     .input2(contractB)
>     .build();
> ```
> I have a use case where I need to process the output of two contracts in a
> single UDF, and currently I have to work around this by adding a constant
> field and using it as the grouping key.
> Adding a keyless version would reduce the overhead (network traffic,
> serialization, and code-writing) and give the compiler additional knowledge:
> it would know that there is only a single group and a single UDF call, so if
> setAvgRecordsEmittedPerStubCall is set, it could infer the output
> cardinality.
> Furthermore, I think this would be consistent, because CoGroup is like Reduce
> for multiple inputs.
> ---------------- Imported from GitHub ----------------
> Url: https://github.com/stratosphere/stratosphere/issues/100
> Created by: [andrehacker|https://github.com/andrehacker]
> Labels: enhancement
> Created at: Sat Sep 14 23:15:59 CEST 2013
> State: open
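
For illustration only, here is a minimal sketch of the semantics described in the quoted proposal. It is plain Java with no Flink or Stratosphere dependency. The class and method names (KeylessCoGroupSketch, coGroup, coGroupWithConstantKey, keylessCoGroup) are hypothetical and not part of the Pact API. The sketch models the constant-key workaround next to a keyless variant that hands both inputs to the UDF in a single call. It says nothing about how the real runtime would ship or serialize the data.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.BiFunction;

// Hypothetical in-memory model; none of these names exist in the Pact API.
final class KeylessCoGroupSketch {

    /** Keyed co-group: one UDF call per distinct key, with that key's records from both inputs. */
    static <K, A, B, R> List<R> coGroup(
            List<Map.Entry<K, A>> left,
            List<Map.Entry<K, B>> right,
            BiFunction<List<A>, List<B>, R> udf) {
        Map<K, List<A>> leftGroups = new LinkedHashMap<>();
        Map<K, List<B>> rightGroups = new LinkedHashMap<>();
        for (Map.Entry<K, A> e : left) {
            leftGroups.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).add(e.getValue());
        }
        for (Map.Entry<K, B> e : right) {
            rightGroups.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).add(e.getValue());
        }
        // Every key that occurs in either input forms a group.
        Set<K> keys = new LinkedHashSet<>(leftGroups.keySet());
        keys.addAll(rightGroups.keySet());
        List<R> out = new ArrayList<>();
        for (K key : keys) {
            out.add(udf.apply(
                    leftGroups.getOrDefault(key, List.of()),
                    rightGroups.getOrDefault(key, List.of())));
        }
        return out;
    }

    /** The workaround: attach the same constant key to every record, then co-group on it. */
    static <A, B, R> List<R> coGroupWithConstantKey(
            List<A> a, List<B> b, BiFunction<List<A>, List<B>, R> udf) {
        Integer constantKey = 0;
        List<Map.Entry<Integer, A>> keyedA = new ArrayList<>();
        for (A x : a) {
            keyedA.add(Map.entry(constantKey, x));
        }
        List<Map.Entry<Integer, B>> keyedB = new ArrayList<>();
        for (B y : b) {
            keyedB.add(Map.entry(constantKey, y));
        }
        return coGroup(keyedA, keyedB, udf);
    }

    /**
     * Keyless variant: no key field is added, and the UDF is called exactly once.
     * Assumption: the proposal does not say what happens when both inputs are
     * empty; this sketch still makes the single call.
     */
    static <A, B, R> R keylessCoGroup(
            List<A> a, List<B> b, BiFunction<List<A>, List<B>, R> udf) {
        return udf.apply(a, b);
    }

    public static void main(String[] args) {
        List<Integer> a = List.of(1, 2, 3);
        List<String> b = List.of("x", "y");
        BiFunction<List<Integer>, List<String>, Integer> countAll =
                (as, bs) -> as.size() + bs.size();
        System.out.println(coGroupWithConstantKey(a, b, countAll));
        System.out.println(keylessCoGroup(a, b, countAll));
    }
}
```

For non-empty inputs both paths make a single UDF call over the same records. The keyless path skips building the keyed copies, which is the in-memory analogue of the extra field the workaround has to carry.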