Lijie Wang created FLINK-25034:
----------------------------------

             Summary: Support flexible number of subpartitions in 
IntermediateResultPartition
                 Key: FLINK-25034
                 URL: https://issues.apache.org/jira/browse/FLINK-25034
             Project: Flink
          Issue Type: Sub-task
          Components: Runtime / Coordination
            Reporter: Lijie Wang


Currently, when a task is deployed, it needs to know the parallelism of its 
consumer job vertex. This is because the consumer vertex parallelism is needed 
to decide the _numberOfSubpartitions_ of _PartitionDescriptor_ which is part of 
the {_}ResultPartitionDeploymentDescriptor{_}. The reason behind that is, at 
the moment, for one result partition, different subpartitions serve different 
consumer execution vertices. More specifically, one consumer execution vertex 
only consumes data from subpartition with the same index. 

Considering a dynamic graph, the parallelism of a job vertex may not have been 
decided when its upstream vertices are deployed. To enable Flink to work in 
this case, we need a way to allow an execution vertex to run without knowing 
the parallelism of its consumer job vertices. One basic idea is to enable 
multiple subpartitions in one result partition to serve the same consumer 
execution vertex.

To achieve this goal, we can set the number of subpartitions to be the *max 
parallelism* of the consumer job vertex. When the consumer vertex is deployed, 
it should be assigned with a subpartition range to consume.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

Reply via email to