[ https://issues.apache.org/jira/browse/FLINK-6046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nico Kruber updated FLINK-6046:
-------------------------------
    Component/s: Network

> Add support for oversized messages during deployment
> ----------------------------------------------------
>
>                 Key: FLINK-6046
>                 URL: https://issues.apache.org/jira/browse/FLINK-6046
>             Project: Flink
>          Issue Type: New Feature
>          Components: Distributed Coordination, Network
>            Reporter: Nico Kruber
>            Assignee: Nico Kruber
>
> This is the non-FLIP6 version of FLINK-4346, restricted to deployment 
> messages:
> Currently, messages larger than the maximum Akka frame size cause an error 
> when being transported. We should add a way to pass messages that are larger 
> than {{akka.framesize}}, as may happen for task deployments via the 
> {{TaskDeploymentDescriptor}}.
> We should use the {{BlobServer}} to offload big data items (if possible) and 
> make use of any distributed file system behind it. This way, we not only 
> avoid the Akka frame size restriction but may also be able to speed up 
> deployment.
> I suggest the following changes:
>   - the sender, i.e. the {{Execution}} class, tries to store the serialized 
> job information and serialized task information (if oversized) from the 
> {{TaskDeploymentDescriptor}} (tdd) on the {{BlobServer}} as a single 
> {{NAME_ADDRESSABLE}} blob under its job ID (if this does not work, we send 
> the whole tdd via Akka as usual)
>   - if stored in a blob, these data items are removed from the tdd
>   - the receiver, i.e. the {{TaskManager}} class, tries to retrieve any 
> offloaded data after receiving the {{TaskDeploymentDescriptor}} via Akka; it 
> then re-assembles the original tdd
>   - the stored blob may be deleted after re-assembly of the tdd
> Further (future) changes may include:
>   - separating the serialized job information and serialized task information 
> into two files and re-using the first one for all tasks
>   - not re-deploying these two during job recovery (if possible)
>   - then, like all other {{NAME_ADDRESSABLE}} blobs, these offloaded blobs may 
> be removed when the job enters a final state instead of right after 
> re-assembly
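
The suggested sender/receiver flow above could be sketched roughly as follows. This is a minimal illustration and not Flink code: {{OffloadSketch}}, {{InMemoryBlobStore}}, {{Descriptor}}, {{prepare}} and {{reassemble}} are hypothetical names, the blob store is a plain in-memory map standing in for the {{BlobServer}}, and the serialized job and task information are treated as one opaque byte array.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch; none of these types are actual Flink classes.
class OffloadSketch {

    /** In-memory stand-in for a blob store addressed by (job ID, name). */
    static final class InMemoryBlobStore {
        private final Map<String, byte[]> blobs = new HashMap<>();

        void put(String jobId, String name, byte[] data) {
            blobs.put(jobId + "/" + name, data.clone());
        }

        Optional<byte[]> get(String jobId, String name) {
            return Optional.ofNullable(blobs.get(jobId + "/" + name));
        }

        void delete(String jobId, String name) {
            blobs.remove(jobId + "/" + name);
        }
    }

    /** Simplified descriptor: the payload is either inline or referenced by blob name. */
    static final class Descriptor {
        final String jobId;
        final byte[] inlinePayload; // null when the payload was offloaded
        final String blobName;      // null when the payload is sent inline

        Descriptor(String jobId, byte[] inlinePayload, String blobName) {
            this.jobId = jobId;
            this.inlinePayload = inlinePayload;
            this.blobName = blobName;
        }
    }

    /** Sender side: offload oversized payloads, otherwise keep them inline. */
    static Descriptor prepare(String jobId, String taskId, byte[] payload,
                              int maxInlineBytes, InMemoryBlobStore store) {
        if (payload.length <= maxInlineBytes) {
            return new Descriptor(jobId, payload, null);
        }
        try {
            String name = "tdd-" + taskId; // assumed naming scheme
            store.put(jobId, name, payload);
            // The data item is removed from the descriptor once it is stored.
            return new Descriptor(jobId, null, name);
        } catch (RuntimeException e) {
            // Offloading failed: fall back to sending everything inline.
            return new Descriptor(jobId, payload, null);
        }
    }

    /** Receiver side: fetch offloaded data, then delete the blob. */
    static byte[] reassemble(Descriptor d, InMemoryBlobStore store) {
        if (d.blobName == null) {
            return d.inlinePayload;
        }
        byte[] data = store.get(d.jobId, d.blobName)
                .orElseThrow(() -> new IllegalStateException("missing blob " + d.blobName));
        store.delete(d.jobId, d.blobName);
        return data;
    }

    public static void main(String[] args) {
        InMemoryBlobStore store = new InMemoryBlobStore();
        Descriptor d = prepare("job-1", "task-a", new byte[64], 16, store);
        System.out.println("offloaded: " + (d.blobName != null));
        System.out.println("restored bytes: " + reassemble(d, store).length);
    }
}
```

Deleting the blob inside {{reassemble}} mirrors the "may be deleted after re-assembly" step; under the future changes listed above, that deletion would instead be tied to the job reaching a final state.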



