[
https://issues.apache.org/jira/browse/FLINK-1023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077865#comment-14077865
]
ASF GitHub Bot commented on FLINK-1023:
---------------------------------------
GitHub user StephanEwen opened a pull request:
https://github.com/apache/incubator-flink/pull/84
[FLINK-1023] Switch group-at-a-time function to Iterable (from Iterator)
This patch allows the *GroupReduce* and the *CoGroup* to use the beautiful
*foreach* loop syntax.
```java
public void reduce(Iterable<Long> values, Collector<Long> out) {
    long sum = 0L;
    for (Long num : values) {
        sum += num;
    }
    out.collect(sum);
}
```
Since the data behind the iterable is transient, you cannot iterate over it
multiple times. The next time you request an iterator, it will pick up where
the previous one left off, potentially returning an empty iterator:
```java
public void reduce(Iterable<Long> values, Collector<Long> out) {
    for (Long num : values) {
        // do something
    }
    for (Long num : values) {
        // empty loop, will never be entered
    }
}
```
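If a function really needs more than one pass over a group, one option is to copy the values into a collection during the first pass. The sketch below is not Flink code: *SinglePassIterable* is a hypothetical stand-in that mimics the behavior described above (every call to iterator() continues where the previous one stopped), and *countAboveMean* is an illustrative helper. It assumes that the group fits in memory and that the runtime does not reuse the element objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for the transient iterable handed to a group function:
// all iterators share one underlying cursor, so the data can be consumed only once.
class SinglePassIterable<T> implements Iterable<T> {
    private final Iterator<T> source;

    SinglePassIterable(Iterator<T> source) {
        this.source = source;
    }

    @Override
    public Iterator<T> iterator() {
        return source; // picks up where the previous iterator left off
    }
}

class TwoPassExample {
    // Counts the values strictly greater than the mean, which needs two passes.
    static int countAboveMean(Iterable<Long> values) {
        // First (and only) pass over the transient data: buffer it and sum it.
        List<Long> buffered = new ArrayList<>();
        long sum = 0L;
        for (Long num : values) {
            buffered.add(num);
            sum += num;
        }
        if (buffered.isEmpty()) {
            return 0;
        }
        double mean = (double) sum / buffered.size();

        // The second pass runs over the in-memory copy, not over 'values'.
        int count = 0;
        for (Long num : buffered) {
            if (num > mean) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        Iterable<Long> group =
                new SinglePassIterable<>(Arrays.asList(1L, 2L, 3L, 10L).iterator());
        System.out.println(countAboveMean(group)); // mean is 4.0, only 10 is above it
    }
}
```

Buffering gives up the streaming behavior of the group, so it is only reasonable when groups are known to be small.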
Some functions need iterator behavior, which is still available via the iterable's iterator() method:
```java
// Variant 1: only check whether the second input has any elements
public void coGroup(Iterable<Long> values1, Iterable<Long> values2,
        Collector<Long> out) {
    if (values2.iterator().hasNext()) {
        // do something
    }
}

// Variant 2: work with the iterator explicitly
public void coGroup(Iterable<Long> values1, Iterable<Long> values2,
        Collector<Long> out) {
    Iterator<Long> iter = values2.iterator();
    if (iter.hasNext()) {
        do {
            Long value = iter.next();
            // do something with value
        } while (iter.hasNext());
    }
}
```
----
Note:
The *Iterable* and the *Iterator* are currently strictly in sync. That
means one can pick up where the other left off and vice versa. While I would not
encourage using this pattern, you can actually mix them and do something
like the following (which is equivalent to the first example):
```java
public void reduce(Iterable<Long> values, Collector<Long> out) {
    long sum = values.iterator().next();
    for (Long num : values) {
        sum += num;
    }
    out.collect(sum);
}
```
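To make the "in sync" behavior concrete, here is a toy model in plain Java. It is only a sketch of one way such behavior can arise and is not taken from the patch: *SyncedIterable* and *SyncDemo* are hypothetical names, and the object simply acts as both the iterable and its only iterator.

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical model: the object is both the Iterable and its only Iterator,
// so every iterator() call shares the same position in the data.
class SyncedIterable<T> implements Iterable<T>, Iterator<T> {
    private final Iterator<T> source;

    SyncedIterable(Iterator<T> source) {
        this.source = source;
    }

    @Override
    public Iterator<T> iterator() {
        return this;
    }

    @Override
    public boolean hasNext() {
        return source.hasNext();
    }

    @Override
    public T next() {
        return source.next();
    }

    @Override
    public void remove() {
        throw new UnsupportedOperationException();
    }
}

class SyncDemo {
    // Mixed style: take the first element by hand, then let foreach continue.
    static long sumMixed(Iterable<Long> values) {
        long sum = values.iterator().next(); // consumes the first element
        for (Long num : values) {            // continues with the second element
            sum += num;
        }
        return sum;
    }

    // Plain foreach style, for comparison.
    static long sumForeach(Iterable<Long> values) {
        long sum = 0L;
        for (Long num : values) {
            sum += num;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumMixed(new SyncedIterable<>(Arrays.asList(1L, 2L, 3L).iterator())));
        System.out.println(sumForeach(new SyncedIterable<>(Arrays.asList(1L, 2L, 3L).iterator())));
    }
}
```

With an ordinary collection such as a List, the mixed variant would count the first element twice, which is one more reason not to rely on this pattern.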
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/StephanEwen/incubator-flink iterable
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-flink/pull/84.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #84
----
commit d4f2a65fc4faa013181ceabcc8e3c0975c99156e
Author: Stephan Ewen <[email protected]>
Date: 2014-07-29T13:58:44Z
[FLINK-1023] Switch group-at-a-time function to java.lang.Iterable (from
java.util.Iterator)
----
> Provide Iterable instead of Iterator to grouped functions
> ---------------------------------------------------------
>
> Key: FLINK-1023
> URL: https://issues.apache.org/jira/browse/FLINK-1023
> Project: Flink
> Issue Type: Wish
> Components: Java API
> Reporter: Ufuk Celebi
> Priority: Trivial
> Labels: breaking-api
>
> I would like the grouped functions to provide an Iterable instead of an
> Iterator to the user, e.g. for the {{reduce}} method of
> {{GroupReduceFunction}}.
> We had a discussion about this previously (I couldn't find the respective
> issues/list threads right now) and the result was in favor of the change.
> We never got around to really pushing for it, because of the API break. With the
> renaming, it should be less of an issue.
--
This message was sent by Atlassian JIRA
(v6.2#6252)