[GitHub] [flink] StephanEwen opened a new pull request #13239: [FLINK-19021][network] Various cleanups and refactorings of the ResultPartition

GitBox Tue, 25 Aug 2020 03:21:58 -0700


StephanEwen opened a new pull request #13239:
URL: https://github.com/apache/flink/pull/13239



   ## What is the purpose of the change
   
   This is a series of cleanups and refactorings with the ultimate goal to 
prepare the `ResultPartition` interface for a possible switch to be 
*"record-oriented"* rather than *"buffer-oriented with pre-partitioned 
buffers"* as it is now.
   
   A *record-oriented* `ResultPartition` is more flexible and supports, for 
example, implementations like a sorted/clustered shuffle where records are 
gathered in a joint buffer, then sorted/clustered by partition and then written 
to a joint file. This in turn helps reduce memory and file requirements for 
large-scale  shuffles.
   
   
   ## Brief change log
   
     - 5ea80cb8051c87658a604588c41a83349f061f67 removes the behavior of the 
SpanningRecordSerializer to prune its internal serialization buffer under 
special circumstances.
       Previously, the buffer was pruned when:
         - The buffer becomes larger than a certain threshold (5MB)
         - The full record end lines up exactly with a full buffer length (this 
change got introduced at some point, it is not clear what the purpose is)
   
       This optimization virtually never kicks in (because of the second 
condition) and also is unnecessary. There is only a single serializer on the 
sender side, so this will not help to reduce the maximum memory footprint 
needed in any way.
   
     - e41b74d83c44abc9b0f4511156d896455ff7a9dc removes the `releaseMemory()` 
call in the `ResultSubpartition`. There is currently no implementation of this 
(it is only there because of past (and possible future) implementations)
   
       Future versions where memory may have to be released will quite possibly 
not implement that on a `ResultPartition` level, rather than on a subpartition 
level. For example, a sort based shuffle has the buffers on a `ResultPartition` 
level.
   
     - d7c17e269b28f562b0264fe1574894c2099f0fa6 removes the config option 
`taskmanager.network.partition.force-release-on-consumption`
   
       This option was a fallback/safeguard option introduced when the 
fine-grained failover for batch was added. With the fine-grained batch 
failover, the batch result partitions were no longer immediately cleaned up, 
but retained for recovery until the scheduler decided that they were no longer 
needed. The purpose of the flag was to restore the old "immediate cleanup" 
behavior should it be needed.
   
       Because the batch failover has proven stable and this fallback option 
can be removed now.
   
     - cd0374117bf0d5e5e2bf2172a02e2b302fcc3bcc introduces separate classes for 
different ResultPartition types
   
       Given that the partitions behave differently regarding checkpoints, 
releasing, etc. the code is cleaner separated by introducing dedicated classes 
for the `ResultPartition` based on the type.
       **This is also an important preparation to later have more different 
implementations, like sort-based shuffles.**
   
       _Important:_ These new classes will not override any performance 
critical methods (like adding a buffer to the result). They merely specialize 
certain behaviors around checkpointing and cleanup.
   
     - b36f28884f445d81f9903fbc8e930295130136be moves the unaligned checkpoint 
methods from `ResultPartition` to a separate interface.
   
       This improves the problem that `ResultPartitions` have the unaligned 
checkpointing methods, but some do not support checkpoints and throw an 
UnsupportedOperationException. This segregates the interfaces and puts the 
methods relating to unaligned checkpoints in a dedicated interface.
   
   
   ## Verifying this change
   
   This change is only refactoring and cleanup, it does not change behavior. 
The Unit Tests are adjusted.
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): **no**
     - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: **no**
     - The serializers: **no**
     - The runtime per-record code paths (performance sensitive): **no**
     - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: **no**
     - The S3 file system connector: **no**
   
   ## Documentation
   
     - Does this pull request introduce a new feature? **no**
     - If yes, how is the feature documented? **not applicable**
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] StephanEwen opened a new pull request #13239: [FLINK-19021][network] Various cleanups and refactorings of the ResultPartition

Reply via email to