Github user StephanEwen commented on a diff in the pull request:
https://github.com/apache/incubator-flink/pull/42#discussion_r14509280
--- Diff:
stratosphere-java/src/main/java/eu/stratosphere/api/java/DataSet.java ---
@@ -135,6 +139,27 @@ public ExecutionEnvironment getExecutionEnvironment() {
}
return new MapOperator<T, R>(this, mapper);
}
+
+
+
+ /**
+ * Applies a Map transformation on a {@link DataSet} by using an
iterator.<br/>
--- End diff --
I think this comment is not quite correct. Something more appropriate is:
```
Applies a Map operation to the entire partition of the data. The function
is called once per parallel partition of the data, and the entire partition is
available through the given Iterator. The number of elements that each instance
of the MapPartition function sees is non-deterministic and depends on the
degree of parallelism of the operation.
This function is intended for operations that cannot transform individual
elements and that require no grouping of elements. To transform individual
elements, the use of {@code map()} and {@code flatMap()} is preferable.
```
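To make the "once per partition" semantics concrete, here is a minimal sketch in plain Java. It does not use the actual API from this pull request: `PartitionSketch`, `PartitionFunction`, and the round-robin split are all assumptions made up for illustration. It only shows that the function runs once per partition and that the number of elements it sees changes with the degree of parallelism.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class PartitionSketch {

    /** Hypothetical stand-in for a per-partition function (not the real interface). */
    interface PartitionFunction<T, R> {
        void mapPartition(Iterator<T> records, List<R> out);
    }

    /**
     * Splits the input round-robin into `parallelism` partitions (an assumed
     * strategy, chosen only to keep the sketch deterministic) and calls the
     * function exactly once per partition.
     */
    static <T, R> List<R> runPerPartition(List<T> input, int parallelism,
                                          PartitionFunction<T, R> fn) {
        List<List<T>> partitions = new ArrayList<List<T>>();
        for (int i = 0; i < parallelism; i++) {
            partitions.add(new ArrayList<T>());
        }
        for (int i = 0; i < input.size(); i++) {
            partitions.get(i % parallelism).add(input.get(i));
        }
        List<R> out = new ArrayList<R>();
        for (List<T> partition : partitions) {
            // One call per partition; the whole partition is behind the iterator.
            fn.mapPartition(partition.iterator(), out);
        }
        return out;
    }

    /** Emits one count per partition, something a per-element map() cannot do. */
    static List<Integer> countPerPartition(List<String> input, int parallelism) {
        return PartitionSketch.<String, Integer>runPerPartition(input, parallelism,
            (records, out) -> {
                int n = 0;
                while (records.hasNext()) {
                    records.next();
                    n++;
                }
                out.add(n);
            });
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("a", "b", "c", "d", "e");
        System.out.println(countPerPartition(data, 1)); // prints [5]
        System.out.println(countPerPartition(data, 2)); // prints [3, 2]
    }
}
```

The same input yields different per-call element counts for different degrees of parallelism, which is why the suggested comment warns that the count should not be relied upon.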