[jira] [Resolved] (YUNIKORN-2498) Implement force create flag in k8shim for recovery queue
     [ https://issues.apache.org/jira/browse/YUNIKORN-2498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wilfred Spiegelenburg resolved YUNIKORN-2498.
--------------------------------------------
    Fix Version/s: 1.6.0
       Resolution: Fixed

> Implement force create flag in k8shim for recovery queue
> --------------------------------------------------------
>
>                 Key: YUNIKORN-2498
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-2498
>             Project: Apache YuniKorn
>          Issue Type: Task
>          Components: shim - kubernetes
>            Reporter: Wilfred Spiegelenburg
>            Assignee: Wilfred Spiegelenburg
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.6.0
>
>
> As part of the initialisation changes, a new recovery queue was added to allow
> already running allocations to be restored even if the queue config was
> changed. The implementation on the k8shim side needs to be added to leverage
> the forced create flag from YUNIKORN-1887.
> Without that, the changes added for the recovery queue will not be used.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@yunikorn.apache.org
For additional commands, e-mail: dev-h...@yunikorn.apache.org
[jira] [Created] (YUNIKORN-2526) Discrepancy between shim cache and core app/task list after scheduler restart
Shravan Achar created YUNIKORN-2526:
---------------------------------------

             Summary: Discrepancy between shim cache and core app/task list after scheduler restart
                 Key: YUNIKORN-2526
                 URL: https://issues.apache.org/jira/browse/YUNIKORN-2526
             Project: Apache YuniKorn
          Issue Type: Bug
          Components: shim - kubernetes
            Reporter: Shravan Achar
         Attachments: log-snippet.txt, state-dump-4-1-3.json

When the scheduler restarts, it occasionally ends up with an application still in the Running state even though that application has already been terminated in the cluster. This is confirmed by the attached state dump.

The scheduler core logs (also attached) show all nodes being evaluated for the non-existent application. CPU is being used up by this unneeded evaluation.
[jira] [Created] (YUNIKORN-2525) Make dispatcher.Stop() shut down quicker
Peter Bacsko created YUNIKORN-2525:
--------------------------------------

             Summary: Make dispatcher.Stop() shut down quicker
                 Key: YUNIKORN-2525
                 URL: https://issues.apache.org/jira/browse/YUNIKORN-2525
             Project: Apache YuniKorn
          Issue Type: Improvement
          Components: shim - kubernetes
            Reporter: Peter Bacsko
            Assignee: Peter Bacsko

{{dispatcher.Stop()}} sometimes takes an extra 1 second to shut down properly. This slows down unit tests.

On my machine, {{context_test.go}} runs for 19-20 seconds. With some improvements, this can be brought down to 1 second.