[ https://issues.apache.org/jira/browse/FLINK-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Konstantin Knauf updated FLINK-2193: ------------------------------------ Labels: (was: auto-closed) Priority: Not a Priority (was: Minor) > Partial shuffling > ----------------- > > Key: FLINK-2193 > URL: https://issues.apache.org/jira/browse/FLINK-2193 > Project: Flink > Issue Type: Improvement > Components: API / DataSet > Reporter: Sebastian Kruse > Priority: Not a Priority > > In some cases, it would come in handy to shuffle only some specific elements > of a dataset instead of all elements. This is currently not achievable with a > custom partitioner. > Use cases for such a feature are: > * Load balancing: split up elements that require high processing load and > distribute the splits among all task managers. > * Evolutionary algorithms: A well-suited EA model for Map/Reduce-like > platforms is the island model, where each worker maintains and evolves its > own population. From time to time, individuals among the population need to > be exchanged. Shuffling all the complete populations is not necessary, though. > A presumably easy way to achieve this feature could be to provide the local > partition number in deployed partitioners, similar to > {{RichFunction#getRuntimeContext()#getIndexOfThisSubtask()}}. -- This message was sent by Atlassian Jira (v8.3.4#803005)