Sebastian Kruse created FLINK-2193:
--------------------------------------
Summary: Partial shuffling
Key: FLINK-2193
URL: https://issues.apache.org/jira/browse/FLINK-2193
Project: Flink
Issue Type: Improvement
Reporter: Sebastian Kruse
Priority: Minor
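In some cases, it would come in handy to shuffle only some specific elements of
a dataset instead of all elements. This is currently not achievable with a
custom partitioner, because the partitioner does not know which partition an
element currently resides in and therefore cannot leave it in place.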
Use cases for such a feature are:
* Load balancing: split up elements that require high processing load and
distribute the splits among all task managers.
* Evolutionary algorithms: A well-suited EA model for Map/Reduce-like platforms
is the island model, where each worker maintains and evolves its own
population. From time to time, a few individuals need to be exchanged between
populations. Shuffling the complete populations is not necessary, though.
A presumably easy way to support this would be to expose the local
partition number to deployed partitioners, similar to
{{RichFunction#getRuntimeContext()#getIndexOfThisSubtask()}}.
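
To illustrate the idea, here is a minimal sketch in plain Java, assuming a partitioner that is additionally handed the index of the partition it runs in. The interface {{LocalAwarePartitioner}}, its third parameter {{localPartition}}, and the classes {{Individual}} and {{MigratingPartitioner}} are hypothetical names made up for this example and are not part of Flink. Flink's existing {{Partitioner#partition(key, numPartitions)}} only receives the first two arguments.

```java
// Hypothetical interface, NOT part of Flink: like Partitioner#partition(key, numPartitions),
// but with the index of the partition the element currently lives in.
interface LocalAwarePartitioner<K> {
    int partition(K key, int numPartitions, int localPartition);
}

// Illustrative element type for the island model use case (assumption).
final class Individual {
    final double fitness;
    final boolean migrate; // true if this individual should leave its island

    Individual(double fitness, boolean migrate) {
        this.fitness = fitness;
        this.migrate = migrate;
    }
}

// Partial shuffle: only individuals flagged for migration change partition.
// Migrants move to the next island in a ring; everything else stays local.
final class MigratingPartitioner implements LocalAwarePartitioner<Individual> {
    @Override
    public int partition(Individual key, int numPartitions, int localPartition) {
        if (!key.migrate) {
            return localPartition; // stay in place, no network transfer needed
        }
        return (localPartition + 1) % numPartitions; // move to the neighbouring island
    }
}
```

With four partitions, a non-migrating individual in partition 2 would stay in partition 2, while a migrating individual in partition 3 would wrap around to partition 0. Whether the runtime could actually skip the network transfer for elements that stay local is a separate question that this sketch does not address.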