rkhachatryan commented on pull request #12292:
URL: https://github.com/apache/flink/pull/12292#issuecomment-633727252


   
   Thanks for the feedback, @pnowojski.
   
   I've addressed the issues (except [this one](https://github.com/apache/flink/pull/12292#discussion_r429887132)).
   
   Answering your question:
   > Could you elaborate a bit more? What's the alternative? How would it avoid more data duplication? Are we still duplicating data with this PR?
   
   The current structure is as follows (this PR doesn't change it):
   ```
   Each subtask reports to JM TaskStateSnapshot,
       each with zero or more OperatorSubtaskState,
           each with zero or more InputChannelStateHandle and ResultSubpartitionStateHandle
               each referencing an underlying StreamStateHandle
   ```
   The underlying `StreamStateHandle` duplicates the file name (`ByteStreamStateHandle` has it too, I guess at least because of `equals`/`hashCode`).
   
   An alternative would be something like:
   ```
   Each subtask reports to JM TaskStateSnapshot,
       each with zero or more OperatorSubtaskState,
           each with zero or one StreamStateHandle for channel state handles
               each with zero or more InputChannelStateHandle and ResultSubpartitionStateHandle
   ```
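
To make the difference between the two layouts concrete, here is a rough sketch. All classes in it (`ChannelStateLayoutSketch`, `FileHandle`, `ChannelHandle`, `ChannelEntry`, `SharedChannelState`) are hypothetical, simplified stand-ins, not the real Flink classes, and it assumes that in the current layout every channel handle ends up carrying its own copy of the file name.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-ins for illustration; NOT the real Flink classes.
final class ChannelStateLayoutSketch {

    /** Stand-in for the underlying stream handle, which carries the file name. */
    static final class FileHandle {
        final String fileName;

        FileHandle(String fileName) {
            this.fileName = fileName;
        }
    }

    /** Current layout: each channel handle references its own underlying handle. */
    static final class ChannelHandle {
        final int channelIndex;
        final long offset;
        final FileHandle delegate;

        ChannelHandle(int channelIndex, long offset, FileHandle delegate) {
            this.channelIndex = channelIndex;
            this.offset = offset;
            this.delegate = delegate;
        }
    }

    /** Alternative layout: a lightweight entry without any file name. */
    static final class ChannelEntry {
        final int channelIndex;
        final long offset;

        ChannelEntry(int channelIndex, long offset) {
            this.channelIndex = channelIndex;
            this.offset = offset;
        }
    }

    /** Alternative layout: one underlying handle shared by all channel entries. */
    static final class SharedChannelState {
        final FileHandle file;
        final List<ChannelEntry> entries = new ArrayList<>();

        SharedChannelState(FileHandle file) {
            this.file = file;
        }
    }

    static List<ChannelHandle> currentLayout(String fileName, long[] offsets) {
        List<ChannelHandle> handles = new ArrayList<>();
        for (int i = 0; i < offsets.length; i++) {
            // Assumption: every channel handle gets its own copy of the file name.
            handles.add(new ChannelHandle(i, offsets[i], new FileHandle(fileName)));
        }
        return handles;
    }

    static SharedChannelState alternativeLayout(String fileName, long[] offsets) {
        SharedChannelState state = new SharedChannelState(new FileHandle(fileName));
        for (int i = 0; i < offsets.length; i++) {
            state.entries.add(new ChannelEntry(i, offsets[i]));
        }
        return state;
    }

    /** Current layout: one file-name copy per channel handle. */
    static int fileNameCopies(List<ChannelHandle> handles) {
        return handles.size();
    }

    /** Alternative layout: a single file-name copy, however many entries there are. */
    static int fileNameCopies(SharedChannelState state) {
        return 1;
    }
}
```

Under these assumptions, the number of file-name copies in the current layout grows with the number of channels, while in the alternative it stays at one per `OperatorSubtaskState`.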

