[ 
https://issues.apache.org/jira/browse/FLINK-685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Knauf reopened FLINK-685:
------------------------------------

Re-opening in accordance with https://issues.apache.org/jira/browse/FLINK-23206.

> Add support for semi-joins
> --------------------------
>
>                 Key: FLINK-685
>                 URL: https://issues.apache.org/jira/browse/FLINK-685
>             Project: Flink
>          Issue Type: New Feature
>          Components: API / DataSet
>            Reporter: GitHub Import
>            Assignee: pietro pinoli
>            Priority: Minor
>              Labels: auto-closed, github-import, stale-assigned
>
> A semi-join is basically a join filter. One input is "filtering" and the 
> other one is "filtered".
> A tuple of the "filtered" input is emitted exactly once if the "filtering" 
> input has one (or more) tuples with matching join keys. That means that the 
> output of a semi-join has the same type as the "filtered" input and the 
> "filtering" input is completely discarded.
> In order to support a semi-join, we need to add an additional physical 
> execution strategy that ensures that a tuple of the "filtered" input is 
> emitted only once, even if the "filtering" input has more than one tuple 
> with matching keys. Furthermore, a couple of optimizations compared to 
> standard joins are possible, such as storing only the keys, not the full 
> tuples, of the "filtering" input in a hash table.
> ---------------- Imported from GitHub ----------------
> Url: https://github.com/stratosphere/stratosphere/issues/685
> Created by: [fhueske|https://github.com/fhueske]
> Labels: enhancement, java api, runtime, 
> Milestone: Release 0.6 (unplanned)
> Created at: Mon Apr 14 12:05:29 CEST 2014
> State: open
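
For illustration only, the semantics described in the issue can be sketched in plain Java. This is not Flink code and not the proposed execution strategy; the class name SemiJoinSketch, the method semiJoin, and the key-extractor parameters are hypothetical names chosen for this example. The sketch builds a hash set that holds only the join keys of the "filtering" input, then probes it with the "filtered" input, so each "filtered" tuple is emitted at most once regardless of how many "filtering" tuples share its key.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Hypothetical, in-memory sketch of semi-join semantics (not a Flink API).
final class SemiJoinSketch {

    // Returns the elements of `filtered` whose key appears at least once
    // among the keys of `filtering`. The output type equals the type of the
    // "filtered" input; the "filtering" input is discarded.
    static <L, R, K> List<L> semiJoin(
            List<L> filtered,
            List<R> filtering,
            Function<? super L, ? extends K> filteredKey,
            Function<? super R, ? extends K> filteringKey) {

        // Build side: store only the keys, not the full tuples.
        // A set collapses duplicate keys, which is what guarantees
        // "emitted only once" on the probe side.
        Set<K> keys = new HashSet<>();
        for (R r : filtering) {
            keys.add(filteringKey.apply(r));
        }

        // Probe side: emit each "filtered" element once if its key is present.
        List<L> result = new ArrayList<>();
        for (L l : filtered) {
            if (keys.contains(filteredKey.apply(l))) {
                result.add(l);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Join key (an assumption for this example): the first character.
        List<String> filtered = List.of("apple", "banana", "cherry");
        List<String> filtering = List.of("avocado", "apricot", "cranberry");

        List<String> out = semiJoin(
                filtered, filtering, s -> s.charAt(0), s -> s.charAt(0));

        // "apple" matches two filtering elements but appears only once.
        System.out.println(out); // prints [apple, cherry]
    }
}
```

A regular inner join on the same inputs would emit "apple" twice, once per matching "filtering" element; the key-only hash set is what removes that duplication and also keeps the build side small.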



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
