[ https://issues.apache.org/jira/browse/FLINK-1267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17321780#comment-17321780 ]
Flink Jira Bot commented on FLINK-1267:
---------------------------------------

This issue and all of its Sub-Tasks have not been updated for 180 days. So, it has been labeled "stale-minor". If you are still affected by this bug or are still interested in this issue, please give an update and remove the label. In 7 days the issue will be closed automatically.

> Add crossGroup operator
> -----------------------
>
>                 Key: FLINK-1267
>                 URL: https://issues.apache.org/jira/browse/FLINK-1267
>             Project: Flink
>          Issue Type: New Feature
>          Components: API / DataSet, Runtime / Task
>    Affects Versions: 0.7.0-incubating
>            Reporter: Fabian Hueske
>            Assignee: pietro pinoli
>            Priority: Minor
>              Labels: stale-minor
>
> A common operator is to pair-wise compare or combine all elements of a group
> (there were two questions about this on the user mailing list, recently).
> Right now, this can be done in two ways:
> 1. {{groupReduce}}: consume and store the complete iterator in memory and
> build all pairs
> 2. do a self-{{Join}}: the engine builds all pairs of the full symmetric
> Cartesian product.
> Both approaches have drawbacks. The {{groupReduce}} variant requires that the
> full group fits into memory and is more cumbersome to implement for the user,
> but pairs can be arbitrarily built. The self-{{Join}} approach pushes most of
> the work into the system, but the execution strategy does not treat the
> self-join different from a regular join (both identical inputs are shuffled,
> etc.) and always builds the full symmetric Cartesian product.
> I propose to add a dedicated {{crossGroup()}} operator, that offers this
> functionality in a proper way.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
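
For readers following the issue, the difference between the two workarounds described above can be sketched in plain Java. This is only an illustration under stated assumptions: it does not use the Flink API, and the class and method names (`CrossGroupSketch`, `uniquePairs`, `fullCross`) are hypothetical. `uniquePairs` mimics the {{groupReduce}} variant (buffer the whole group in memory, then build pairs freely), while `fullCross` mimics the result of the self-{{Join}} variant (every ordered pair, including self-pairs).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Illustrative sketch only; not Flink code and not the proposed crossGroup() operator.
public class CrossGroupSketch {

    // groupReduce-style: the group arrives as an iterator, so it must be
    // fully buffered in memory before any pairs can be built.
    public static <T> List<List<T>> uniquePairs(Iterator<T> group) {
        List<T> buffered = new ArrayList<>();
        while (group.hasNext()) {
            buffered.add(group.next());
        }
        List<List<T>> pairs = new ArrayList<>();
        // Emit each unordered pair exactly once: n * (n - 1) / 2 pairs.
        for (int i = 0; i < buffered.size(); i++) {
            for (int j = i + 1; j < buffered.size(); j++) {
                pairs.add(Arrays.asList(buffered.get(i), buffered.get(j)));
            }
        }
        return pairs;
    }

    // Self-join-style result: the full symmetric Cartesian product,
    // i.e. n * n ordered pairs, including (x, x) and both (x, y) and (y, x).
    public static <T> List<List<T>> fullCross(List<T> group) {
        List<List<T>> pairs = new ArrayList<>();
        for (T left : group) {
            for (T right : group) {
                pairs.add(Arrays.asList(left, right));
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<String> group = Arrays.asList("a", "b", "c");
        System.out.println(uniquePairs(group.iterator())); // 3 pairs for a group of 3
        System.out.println(fullCross(group));              // 9 pairs for a group of 3
    }
}
```

For a group of n elements, the first variant yields n(n-1)/2 pairs at the cost of holding the group in memory, whereas the second always yields n² pairs, which is the overhead the issue attributes to the self-join approach.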