[GitHub] [flink] tillrohrmann commented on pull request #13648: [FLINK-19632] Introduce a new ResultPartitionType for Approximate Local Recovery

GitBox Wed, 28 Oct 2020 02:18:24 -0700


tillrohrmann commented on pull request #13648:
URL: https://github.com/apache/flink/pull/13648#issuecomment-717803324



   Sorry for joining the discussion so late but a couple of questions came up 
when discussing scheduler changes with Yuan offline. I wanted to ask why we 
need a special `ResultPartitionType` for the approximate local recovery? 
Shouldn't it be conceptually possible that we support the normal and 
approximate recovery behaviour with the same pipelined partitions? If we say 
that we can reconnect to every pipelined result partition (including dropping 
partially consumed results), then it can be the responsibility of the scheduler 
to make sure that producers are restarted as well in order to ensure 
exactly-once or at-least-once processing guarantees. If the producers are not 
restarted, then the consumer would simply continue consuming from where it left off.
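
   To make the scheduler-side idea concrete, here is a minimal sketch of how the 
restart scope could depend on the required guarantee. Everything in it is 
hypothetical (`RestartScopeSketch`, `Guarantee`, `tasksToRestart`, and the 
string task names); it is not Flink's scheduler API, and it only walks upstream 
producers, which is a simplification of real failover strategies.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not Flink's scheduler API.
final class RestartScopeSketch {

    enum Guarantee { APPROXIMATE, AT_LEAST_ONCE, EXACTLY_ONCE }

    /**
     * Returns the tasks to restart when `failedTask` fails.
     * `producersOf` maps a task to the tasks whose pipelined results it consumes.
     */
    static Set<String> tasksToRestart(
            String failedTask,
            Map<String, List<String>> producersOf,
            Guarantee guarantee) {
        Set<String> restart = new LinkedHashSet<>();
        restart.add(failedTask);

        // Approximate recovery: only the failed consumer restarts and then
        // reconnects to the still-running producers' partitions.
        if (guarantee == Guarantee.APPROXIMATE) {
            return restart;
        }

        // Stronger guarantees: also restart all transitive upstream producers,
        // so that no partially consumed result is silently dropped.
        Deque<String> pending = new ArrayDeque<>();
        pending.add(failedTask);
        while (!pending.isEmpty()) {
            String current = pending.poll();
            for (String producer : producersOf.getOrDefault(current, List.of())) {
                if (restart.add(producer)) {
                    pending.add(producer);
                }
            }
        }
        return restart;
    }
}
```

   With a `source -> map -> sink` chain, a failing `sink` restarts alone in the 
approximate case, while the other two modes restart all three tasks. The 
partition type is the same in both cases; only the scheduler's decision differs.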
   
   As far as I understand the existing 
`ResultPartitionType.PIPELINED(_BOUNDED)` cannot be used because we release the 
result partition if the downstream consumer disconnects. I believe that this is 
not a strict contract of pipelined result partitions but more of an 
implementation artefact. Couldn't we solve the problem of disappearing 
pipelined result partitions by binding the lifecycle of a pipelined result 
partition to the lifecycle of a `Task`? We could say that a `Task` can only 
terminate once the pipelined result partition has been consumed. Moreover, a 
`Task` will clean up the result partition if it fails or gets canceled. That 
way, we have a clearly defined lifecycle and make sure that these results get 
cleaned up (iff the `Task` reaches a terminal state).
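
   The proposed lifecycle can be sketched as a small state machine. The classes 
below (`SketchPartition`, `SketchTask`, `State`) are hypothetical stand-ins 
rather than Flink's actual `Task` or result partition classes; the sketch only 
illustrates the contract described above, under the assumption that a consumer 
disconnect leaves the partition untouched.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical sketch of the lifecycle contract, not Flink's real classes.
final class PartitionLifecycleSketch {

    enum State { RUNNING, FINISHED, FAILED, CANCELED }

    static final class SketchPartition {
        private final Queue<String> buffers = new ArrayDeque<>();
        private boolean finished; // producer has emitted all of its data
        private boolean released;

        void add(String record) {
            if (released) {
                throw new IllegalStateException("partition already released");
            }
            buffers.add(record);
        }

        void finish() { finished = true; }

        // A consumer disconnect does NOT release the partition, so a
        // restarted consumer can reconnect to it.
        void onConsumerDisconnected() { }

        String poll() { return buffers.poll(); }

        boolean isFullyConsumed() { return finished && buffers.isEmpty(); }

        boolean isReleased() { return released; }

        void release() {
            buffers.clear();
            released = true;
        }
    }

    static final class SketchTask {
        private final SketchPartition partition = new SketchPartition();
        private State state = State.RUNNING;

        SketchPartition partition() { return partition; }

        State state() { return state; }

        // The task may only terminate normally once its result is consumed.
        boolean tryFinish() {
            if (state != State.RUNNING || !partition.isFullyConsumed()) {
                return false;
            }
            enterTerminalState(State.FINISHED);
            return true;
        }

        void fail() {
            if (state == State.RUNNING) {
                enterTerminalState(State.FAILED);
            }
        }

        void cancel() {
            if (state == State.RUNNING) {
                enterTerminalState(State.CANCELED);
            }
        }

        // Reaching any terminal state is the single place where the
        // partition is cleaned up.
        private void enterTerminalState(State terminal) {
            state = terminal;
            partition.release();
        }
    }
}
```

   Because release happens only in `enterTerminalState`, the partition survives 
consumer disconnects, yet it cannot outlive the task that produced it.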
   
   I would love to hear your feedback @pnowojski, @zhijiangW and @rkhachatryan 
and also learn more about your reasoning to introduce a new result partition 
type.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
