GitHub user tzulitai opened a pull request:

    https://github.com/apache/flink/pull/5885

    [FLINK-8715] Remove usage of StateDescriptor in state handles

    ## What is the purpose of the change
    
    This PR is WIP, and is still lacking test coverage.
    It is opened now to collect some feedback for a proposed solution for 
FLINK-8715.
    
    Previously, reconfigured state serializers on restore were not properly 
forwarded to the state handles. In the past, the `StateDescriptor` served as 
the holder for the reconfigured serializer.
    However, since 88ffad27, `StateDescriptor#getSerializer()` hands out 
duplicates of the serializer, so the reconfigured serializer became a 
completely different copy than the one the state handles were using.
    
    This fix corrects this by explicitly forwarding the serializer to the 
instantiated state handles after the state is registered at the state backend. 
It also eliminates the use of `StateDescriptor`s internally in the state 
handles, so that the behaviour is independent of the 
`StateDescriptor#getSerializer()` method's implementation.
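    
    As a rough illustration of the problem and the fix described above, here is 
a minimal, self-contained sketch. All class and method names in it 
(`SketchSerializer`, `SketchDescriptor`, `SketchStateHandle`) are hypothetical 
stand-ins, not Flink classes, and the `reconfigured` flag only models the 
effect of serializer reconfiguration on restore.

```java
public class SerializerForwardingSketch {

    /** Hypothetical stand-in for a state serializer; not Flink's TypeSerializer. */
    static final class SketchSerializer {
        boolean reconfigured = false;

        /** Returns an independent copy carrying the current flag value. */
        SketchSerializer duplicate() {
            SketchSerializer copy = new SketchSerializer();
            copy.reconfigured = this.reconfigured;
            return copy;
        }
    }

    /** Hypothetical descriptor that hands out a fresh duplicate on every call. */
    static final class SketchDescriptor {
        private final SketchSerializer serializer = new SketchSerializer();

        SketchSerializer getSerializer() {
            return serializer.duplicate();
        }
    }

    /** Hypothetical state handle that is given its serializer explicitly. */
    static final class SketchStateHandle {
        final SketchSerializer serializer;

        SketchStateHandle(SketchSerializer serializer) {
            this.serializer = serializer;
        }
    }

    /** Old pattern: the handle asks the descriptor again and gets a different copy. */
    static boolean handleSeesReconfigurationViaDescriptor() {
        SketchDescriptor descriptor = new SketchDescriptor();
        SketchSerializer registered = descriptor.getSerializer();
        registered.reconfigured = true; // models reconfiguration at the backend
        SketchStateHandle handle = new SketchStateHandle(descriptor.getSerializer());
        return handle.serializer.reconfigured;
    }

    /** Proposed pattern: the reconfigured instance is forwarded to the handle. */
    static boolean handleSeesReconfigurationViaForwarding() {
        SketchDescriptor descriptor = new SketchDescriptor();
        SketchSerializer registered = descriptor.getSerializer();
        registered.reconfigured = true; // models reconfiguration at the backend
        SketchStateHandle handle = new SketchStateHandle(registered);
        return handle.serializer.reconfigured;
    }

    public static void main(String[] args) {
        System.out.println(handleSeesReconfigurationViaDescriptor()); // false
        System.out.println(handleSeesReconfigurationViaForwarding()); // true
    }
}
```

    In this sketch the handle never touches the descriptor, so its behaviour 
does not depend on whether `getSerializer()` duplicates.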
    
    The alternative to this approach is to have an internal `setSerializer` 
method on the `StateDescriptor`, which should be used after state serializers 
are reconfigured on registration.
    That would ensure that serializers handed out by the descriptor are always 
reconfigured once the descriptor is registered at the backend.
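    
    The alternative could look roughly like the sketch below. This is only an 
assumption about its shape: `SketchDescriptor`, `SketchSerializer`, and the 
package-private `setSerializer` shown here are hypothetical and do not exist 
in Flink in this form.

```java
public class DescriptorSetSerializerSketch {

    /** Hypothetical stand-in for a state serializer; not Flink's TypeSerializer. */
    static final class SketchSerializer {
        final boolean reconfigured;

        SketchSerializer(boolean reconfigured) {
            this.reconfigured = reconfigured;
        }

        SketchSerializer duplicate() {
            return new SketchSerializer(reconfigured);
        }
    }

    /** Hypothetical descriptor with an internal setter used after registration. */
    static final class SketchDescriptor {
        private SketchSerializer serializer = new SketchSerializer(false);

        /** Internal hook: the backend stores the reconfigured serializer here. */
        void setSerializer(SketchSerializer reconfiguredSerializer) {
            this.serializer = reconfiguredSerializer;
        }

        /** Still hands out duplicates, but now of the reconfigured instance. */
        SketchSerializer getSerializer() {
            return serializer.duplicate();
        }
    }

    static boolean duplicateIsReconfiguredAfterRegistration() {
        SketchDescriptor descriptor = new SketchDescriptor();
        // Models the backend writing the reconfigured serializer back on registration.
        descriptor.setSerializer(new SketchSerializer(true));
        return descriptor.getSerializer().reconfigured;
    }

    public static void main(String[] args) {
        System.out.println(duplicateIsReconfiguredAfterRegistration()); // true
    }
}
```

    The trade-off in this variant is that the descriptor becomes mutable after 
registration, whereas the forwarding approach leaves it untouched.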
    
    ## Brief change log
    
    - Remove `StateDescriptor`s from heap / RocksDB state handle classes
    - Forward the state serializer and any other necessary information provided by 
the state descriptor (e.g. default value, user functions, nested serializers, 
etc.) when instantiating state handles
    
    ## Verifying this change
    
    This fix still lacks test coverage.
    It has been opened to collect feedback for the approach.
    
    ## Does this pull request potentially affect one of the following parts:
    
      - Dependencies (does it add or upgrade a dependency): (yes / **no**)
      - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / **no**)
      - The serializers: (**yes** / no / don't know)
      - The runtime per-record code paths (performance sensitive): (**yes** / 
no / don't know)
      - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / **no** / don't know)
      - The S3 file system connector: (yes / **no** / don't know)
    
    ## Documentation
    
      - Does this pull request introduce a new feature? (yes / **no**)
      - If yes, how is the feature documented? (**not applicable** / docs / 
JavaDocs / not documented)


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tzulitai/flink FLINK-8715

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/5885.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #5885
    
----
commit c092dd6518d9e6f47f4cfc797c18bedc8a89cc05
Author: Tzu-Li (Gordon) Tai <tzulitai@...>
Date:   2018-04-20T13:15:42Z

    [FLINK-8715] Remove usage of StateDescriptor in state handles

----

