[
https://issues.apache.org/jira/browse/FLINK-5131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Zhijiang Wang updated FLINK-5131:
---------------------------------
Description:
Normally a UDF creates only small, short-lived objects that the JVM can
reclaim quickly, so most memory is controlled and managed by the
*TaskManager* framework. In some special cases, however, a UDF may consume
substantial resources to create large, long-lived objects, so advanced users
need options to define the resource usage when required.
The basic approach is as follows:
- Introduce the *ResourceSpec* structure to describe the different resource
factors (CPU cores, heap memory, direct memory, native memory, etc.) and
provide some basic construction methods for resource groups.
- The *ResourceSpec* can be set on the internal transformation in DataStream
and on the base operator in DataSet, respectively.
- During stream graph generation, the *ResourceSpec* values of chained
operators will be aggregated.
- When the *JobManager* requests a slot from the *ResourceManager* to submit a
task, the *ResourceProfile* will be extended to correspond to the
*ResourceSpec*.
- When the *ResourceManager* requests container resources from the cluster, it
should account for extra framework memory in addition to the slot
*ResourceProfile*.
- The framework memory is mainly used by the *NetworkBufferPool* and
*MemoryManager* in the *TaskManager*, and it can be configured at the job
level.
- Apart from resources, the JVM options attached to the container should be
supported and should also be configurable at the job level.
This feature will be implemented directly in the flip-6 branch.
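To make the proposal more concrete, here is a minimal sketch of what a *ResourceSpec* value type and its aggregation for chained operators might look like. Everything in it is an assumption made for illustration: the field names, the `merge` rule (memory is summed, CPU takes the maximum) and the `containerMemoryMb` helper are hypothetical and are not taken from the Flink code base or from this issue.

```java
/**
 * Hypothetical sketch of the ResourceSpec idea described above.
 * Names, units and aggregation rules are illustrative assumptions only.
 */
final class ResourceSpec {
    public final double cpuCores;
    public final int heapMemoryMb;
    public final int directMemoryMb;
    public final int nativeMemoryMb;

    public ResourceSpec(double cpuCores, int heapMemoryMb,
                        int directMemoryMb, int nativeMemoryMb) {
        if (cpuCores < 0 || heapMemoryMb < 0 || directMemoryMb < 0 || nativeMemoryMb < 0) {
            throw new IllegalArgumentException("resource values must be non-negative");
        }
        this.cpuCores = cpuCores;
        this.heapMemoryMb = heapMemoryMb;
        this.directMemoryMb = directMemoryMb;
        this.nativeMemoryMb = nativeMemoryMb;
    }

    /**
     * Aggregates the specs of two operators chained into one task.
     * Assumed rule: memory demands add up, CPU takes the larger value.
     */
    public ResourceSpec merge(ResourceSpec other) {
        return new ResourceSpec(
                Math.max(this.cpuCores, other.cpuCores),
                this.heapMemoryMb + other.heapMemoryMb,
                this.directMemoryMb + other.directMemoryMb,
                this.nativeMemoryMb + other.nativeMemoryMb);
    }

    /** Total memory demand of this spec across all memory types. */
    public int totalMemoryMb() {
        return heapMemoryMb + directMemoryMb + nativeMemoryMb;
    }

    /**
     * Memory to request for a container: the slot's demand plus the extra
     * framework memory (network buffers, managed memory). Hypothetical helper.
     */
    public static int containerMemoryMb(ResourceSpec slotSpec, int frameworkMemoryMb) {
        if (frameworkMemoryMb < 0) {
            throw new IllegalArgumentException("framework memory must be non-negative");
        }
        return slotSpec.totalMemoryMb() + frameworkMemoryMb;
    }
}
```

Under these assumptions, chaining an operator with `new ResourceSpec(1.0, 256, 0, 0)` and one with `new ResourceSpec(0.5, 512, 64, 32)` gives a merged spec with 1.0 CPU cores and 864 MB of total memory, and `containerMemoryMb(merged, 128)` would request 992 MB for the container.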
> Fine-grained Resource Configuration
> -----------------------------------
>
> Key: FLINK-5131
> URL: https://issues.apache.org/jira/browse/FLINK-5131
> Project: Flink
> Issue Type: New Feature
> Components: DataSet API, DataStream API, JobManager, ResourceManager
> Reporter: Zhijiang Wang