[jira] [Commented] (YUNIKORN-1724) Improve the performance of shim side scheduling cycle

Peter Bacsko (Jira) Sat, 06 May 2023 05:35:05 -0700


    [ 
https://issues.apache.org/jira/browse/YUNIKORN-1724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17720209#comment-17720209
 ]


Peter Bacsko commented on YUNIKORN-1724:
----------------------------------------

[~wwei] the profiler shows {{FSM.Current()}} as the culprit. It can be 
expensive to call {{RLock()}} n*10000 times in a loop, and we do this twice 
each time we schedule. With few applications and lots of pods, sorting is not 
a big deal if it's performed only a few times per second. But I updated my PR 
with a BTree, so it's no longer a problem either.
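
The cost pattern described above can be sketched roughly as follows. This is 
only an illustration, not the shim's actual code: the {{task}} type, its 
{{current()}} accessor and {{countInState()}} are hypothetical names, and the 
only assumption carried over from the comment is that reading a task's state 
takes a read lock, as {{FSM.Current()}} does.

```go
package main

import "sync"

// task is a hypothetical stand-in for a shim task whose state lives behind
// a read-write lock, similar in spirit to a state read via FSM.Current().
type task struct {
	mu    sync.RWMutex
	state string
}

// current returns the state under a read lock, so every call pays for one
// RLock/RUnlock pair.
func (t *task) current() string {
	t.mu.RLock()
	defer t.mu.RUnlock()
	return t.state
}

// countInState scans every task, so a single call performs len(tasks)
// lock acquisitions. Running it twice per scheduling cycle doubles that.
func countInState(tasks []*task, state string) int {
	n := 0
	for _, t := range tasks {
		if t.current() == state {
			n++
		}
	}
	return n
}
```

Each lock acquisition is cheap on its own; the overhead comes from repeating 
it for every task on every scheduling pass.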

> Improve the performance of shim side scheduling cycle
> -----------------------------------------------------
>
>                 Key: YUNIKORN-1724
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-1724
>             Project: Apache YuniKorn
>          Issue Type: Sub-task
>          Components: shim - kubernetes
>            Reporter: Peter Bacsko
>            Assignee: Peter Bacsko
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: getNewTasks.png
>
>
> Performance testing of YuniKorn uncovered that a lot of time is spent in 
> {{Application.Schedule()}} in the shim. The problem is related to the fact 
> that we collect task objects based on their state, which is maintained by 
> {{fsm.FSM}}. Even though we run {{Application.Schedule()}} once per 
> second, it's still an issue due to the large number of {{RWMutex.RLock()}} 
> calls. With a lot of pods, this consumes a significant amount of CPU time.
> Two different code paths are affected:
> The first is inside the switch-case part of {{Schedule()}}. We want to 
> know the number of tasks in the "New" state and we end up scanning all task 
> objects for their status. 
> The second is retrieving the "New" tasks from the {{taskMap}} structure. This 
> is done by {{GetNewTasks()}} / {{getTasks()}}, which copy tasks based on 
> their respective state to a new slice.
> To speed things up, we have to track the "New" tasks in a new map which is 
> dynamically maintained when a new task is added and when it leaves the New 
> state (or the task gets removed). Knowing how many "New" tasks we have also 
> becomes trivial and won't require slice iteration/filtering.
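
A minimal sketch of the tracking idea from the description, under stated 
assumptions: the {{app}} type, its fields and all method names below are 
hypothetical and are not taken from the shim or the PR. Task state is modelled 
as a plain string rather than an {{fsm.FSM}}, and a single lock on the 
application guards both maps.

```go
package main

import "sync"

const stateNew = "New"

// app is a hypothetical application holding all tasks plus a separate set
// that contains only the IDs of tasks currently in the New state.
type app struct {
	mu       sync.RWMutex
	tasks    map[string]string   // task ID -> current state
	newTasks map[string]struct{} // IDs of tasks in the New state
}

func newApp() *app {
	return &app{
		tasks:    make(map[string]string),
		newTasks: make(map[string]struct{}),
	}
}

// addTask registers a task; it starts in New and is tracked immediately.
func (a *app) addTask(id string) {
	a.mu.Lock()
	defer a.mu.Unlock()
	a.tasks[id] = stateNew
	a.newTasks[id] = struct{}{}
}

// setState records a transition and keeps the New set in sync with it.
func (a *app) setState(id, state string) {
	a.mu.Lock()
	defer a.mu.Unlock()
	if _, ok := a.tasks[id]; !ok {
		return // unknown task: nothing to update
	}
	a.tasks[id] = state
	if state == stateNew {
		a.newTasks[id] = struct{}{}
	} else {
		delete(a.newTasks, id)
	}
}

// removeTask drops the task from both maps.
func (a *app) removeTask(id string) {
	a.mu.Lock()
	defer a.mu.Unlock()
	delete(a.tasks, id)
	delete(a.newTasks, id)
}

// newTaskCount takes one read lock and does no per-task scanning.
func (a *app) newTaskCount() int {
	a.mu.RLock()
	defer a.mu.RUnlock()
	return len(a.newTasks)
}

// newTaskIDs copies only the tracked IDs, in no particular order.
func (a *app) newTaskIDs() []string {
	a.mu.RLock()
	defer a.mu.RUnlock()
	ids := make([]string, 0, len(a.newTasks))
	for id := range a.newTasks {
		ids = append(ids, id)
	}
	return ids
}
```

The trade-off in this sketch is a little extra bookkeeping on every add, 
transition and removal in exchange for a count that needs one lock and a copy 
that touches only the New tasks.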



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

