[ https://issues.apache.org/jira/browse/FLINK-25328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17463590#comment-17463590 ]

Shammon edited comment on FLINK-25328 at 12/22/21, 5:42 AM:
------------------------------------------------------------

Thanks [~xtsong], I think the idea of pending requests is interesting.

I agree that when join/agg operators, RocksDB, or Python request more segments
and can no longer allocate them from memory (the total usage of segments has
reached the maximum), we can keep these requests pending until segments are
freed again. I think that's a good improvement to `MemoryManager`.
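To make sure we mean the same thing, here is a rough sketch of the
pending-request idea (hypothetical class and method names, not the actual
`MemoryManager` API):

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;

/** Sketch: a page budget that parks requests instead of failing when exhausted. */
class PendingSegmentBudget {

    private static final class PendingRequest {
        final long pages;
        final CompletableFuture<Void> future = new CompletableFuture<>();

        PendingRequest(long pages) {
            this.pages = pages;
        }
    }

    private long freePages;
    private final Queue<PendingRequest> pendingRequests = new ArrayDeque<>();

    PendingSegmentBudget(long totalPages) {
        this.freePages = totalPages;
    }

    /** Reserves pages now if possible, otherwise returns a future that completes later. */
    synchronized CompletableFuture<Void> requestPages(long pages) {
        if (pendingRequests.isEmpty() && pages <= freePages) {
            freePages -= pages;
            return CompletableFuture.completedFuture(null);
        }
        // Budget exhausted (or older requests are waiting): park the request.
        PendingRequest pending = new PendingRequest(pages);
        pendingRequests.add(pending);
        return pending.future;
    }

    /** Returns pages to the budget and completes pending requests that now fit. */
    synchronized void releasePages(long pages) {
        freePages += pages;
        while (!pendingRequests.isEmpty() && pendingRequests.peek().pages <= freePages) {
            PendingRequest next = pendingRequests.poll();
            freePages -= next.pages;
            next.future.complete(null);
        }
    }
}
```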

I wonder if I understand the second point correctly: conversely, when some
tasks finish, their freed segments should not be deallocated immediately even
if there are no pending requests. These segments should be reused by later
tasks, and we can deallocate them periodically to reduce memory usage. What do
you think? :)
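Something like the following is what I have in mind for the reuse part (again
hypothetical names, with `ByteBuffer` standing in for `MemorySegment`):

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Sketch: freed segments go back into a pool and idle ones are trimmed periodically. */
class ReusableSegmentPool implements AutoCloseable {

    private final int segmentSize;
    private final Deque<ByteBuffer> freeSegments = new ArrayDeque<>();
    private final ScheduledExecutorService trimmer =
            Executors.newSingleThreadScheduledExecutor();

    ReusableSegmentPool(int segmentSize, long trimIntervalMs) {
        this.segmentSize = segmentSize;
        // Periodically drop part of the idle segments to bound memory usage.
        trimmer.scheduleAtFixedRate(
                this::trim, trimIntervalMs, trimIntervalMs, TimeUnit.MILLISECONDS);
    }

    /** Later tasks take a pooled segment first and only allocate when the pool is empty. */
    synchronized ByteBuffer allocate() {
        ByteBuffer reused = freeSegments.poll();
        return reused != null ? reused : ByteBuffer.allocateDirect(segmentSize);
    }

    /** Finished tasks return segments to the pool instead of freeing them. */
    synchronized void release(ByteBuffer segment) {
        segment.clear();
        freeSegments.push(segment);
    }

    /** Drops half of the idle segments; the dropped buffers become eligible for GC. */
    private synchronized void trim() {
        for (int i = freeSegments.size() / 2; i > 0; i--) {
            freeSegments.poll();
        }
    }

    @Override
    public void close() {
        trimmer.shutdownNow();
        synchronized (this) {
            freeSegments.clear();
        }
    }
}
```

Trimming only part of the idle segments each round keeps some warm segments for
the next job while still bounding the memory held by idle slots.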


was (Author: zjureel):
Thanks [~xtsong], I think the idea of pending requests is interesting.

I agree that when join/agg operators, RocksDB, or Python request more segments
and can no longer allocate them from memory (the total usage of segments has
reached the maximum), we can keep these requests pending until segments are
freed again. I think that's a good improvement to `MemoryManager`.

I wonder if I understand the second point correctly: conversely, when some
tasks finish, their freed segments should not be deallocated immediately even
if there are no pending requests. These segments should be reused by later
tasks. What do you think? :)

> Improvement of reuse segments for join/agg/sort operators in TaskManager for 
> flink olap queries
> -----------------------------------------------------------------------------------------------
>
>                 Key: FLINK-25328
>                 URL: https://issues.apache.org/jira/browse/FLINK-25328
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / Coordination
>    Affects Versions: 1.14.0, 1.12.5, 1.13.3
>            Reporter: Shammon
>            Priority: Major
>
>     We submit batch jobs to a Flink session cluster as OLAP queries, and
> these jobs' subtasks in the TaskManager are frequently created and destroyed
> because they finish their work quickly. Each slot in the TaskManager manages
> a `MemoryManager` for the multiple tasks of one job, and the `MemoryManager`
> is closed when all the subtasks are finished. Operators such as
> Join/Aggregate/Sort in the subtasks allocate `MemorySegment`s via the
> `MemoryManager`, and these `MemorySegment`s are freed when the operators
> finish.
>     
>     This causes excessive allocation and freeing of `MemorySegment`s in the
> TaskManager. For example, suppose a TaskManager contains 50 slots, one job
> runs 3 join/agg operators in each slot, and each operator allocates and
> initializes 2000 segments. If the subtasks of a job take 100ms to execute,
> the TaskManager will execute 10 jobs' subtasks per second and will allocate
> and free 2000 * 3 * 50 * 10 = 3,000,000 segments for them. Allocating and
> freeing so many segments causes two issues:
> 1) It increases the CPU usage of the TaskManager.
> 2) It increases the cost of subtasks in the TaskManager, which increases job
> latency and decreases the QPS.
>       To improve the reuse of memory segments between jobs in the same slot,
> we propose not to drop the `MemoryManager` when all the subtasks in the slot
> are finished. The slot will hold the `MemoryManager` and not immediately free
> the allocated `MemorySegment`s in it. When subtasks of another job are
> assigned to the slot, they don't need to allocate segments from memory and
> can reuse the `MemoryManager` and the `MemorySegment`s in it.  WDYT?
> [~xtsong] THX
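For reference, a rough sketch of the slot-level retention described above
(hypothetical names, not the actual `TaskSlot` / `MemoryManager` code): the
slot keeps the `MemoryManager` when the last subtask finishes and only closes
it when the slot itself is freed.

```java
import java.util.function.Supplier;

/** Sketch: a slot that caches its MemoryManager across jobs instead of closing it. */
class SlotMemory {

    /** Placeholder for Flink's MemoryManager; only the lifecycle matters here. */
    interface Manager extends AutoCloseable {
        void close();
    }

    private final Supplier<Manager> managerFactory;
    private Manager cachedManager;

    SlotMemory(Supplier<Manager> managerFactory) {
        this.managerFactory = managerFactory;
    }

    /** Subtasks of a newly assigned job reuse the cached manager if one exists. */
    synchronized Manager getOrCreateManager() {
        if (cachedManager == null) {
            cachedManager = managerFactory.get();
        }
        return cachedManager;
    }

    /** Called when all subtasks in the slot have finished: keep the manager and its segments. */
    synchronized void onAllSubtasksFinished() {
        // Intentionally no cachedManager.close() here, so segments stay allocated for reuse.
    }

    /** Only when the slot itself is freed is the memory actually released. */
    synchronized void onSlotFreed() {
        if (cachedManager != null) {
            cachedManager.close();
            cachedManager = null;
        }
    }
}
```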



