[
https://issues.apache.org/jira/browse/YUNIKORN-3239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18065689#comment-18065689
]
Ashwin Shroff commented on YUNIKORN-3239:
-----------------------------------------
Relevant thread on the yk-users Slack channel:
https://yunikornworkspace.slack.com/archives/CLNUW68MU/p1773130145832729
> Lag is processing register node event in yunikorn core
> ------------------------------------------------------
>
> Key: YUNIKORN-3239
> URL: https://issues.apache.org/jira/browse/YUNIKORN-3239
> Project: Apache YuniKorn
> Issue Type: Bug
> Components: core - scheduler
> Reporter: Ashwin Shroff
> Priority: Major
> Attachments: stack-trace.json
>
>
> We are running yunikorn-core with the YuniKorn K8s shim, and we observe long
> delays in YuniKorn processing node update events.
> As shown in the stack trace below, the shim has been waiting for more than
> 53 minutes to get an ack from the core for a node update event. The full
> stack trace that was collected is also attached (stack-trace.json).
> goroutine 2976 [sync.WaitGroup.Wait, 53 minutes]:
> sync.runtime_SemacquireWaitGroup(0xc000056a20?)
>     runtime/sema.go:110 +0x25
> sync.(*WaitGroup).Wait(0xc000056a20?)
>     sync/waitgroup.go:118 +0x48
> github.com/apache/yunikorn-k8shim/pkg/cache.(*Context).registerNodesInternal(0xc00086f830, {0xc00039ce48, 0x1, 0x1}, 0xc09be3bd40)
>     github.com/apache/yunikorn-k8shim/pkg/cache/context.go:1605 +0x58c
> github.com/apache/yunikorn-k8shim/pkg/cache.(*Context).registerNodes(0xc00086f830, {0xc082d6bad0, 0x1, 0xc000280690?})
>     github.com/apache/yunikorn-k8shim/pkg/cache/context.go:1538 +0x4c5
> github.com/apache/yunikorn-k8shim/pkg/cache.(*Context).registerNode(0xc000280690?, 0xc006e99b08)
>     github.com/apache/yunikorn-k8shim/pkg/cache/context.go:1499 +0x2e
> github.com/apache/yunikorn-k8shim/pkg/cache.(*Context).updateNodeInternal(0xc00086f830, 0xc006e99b08, 0x1)
>     github.com/apache/yunikorn-k8shim/pkg/cache/context.go:187 +0x79
> github.com/apache/yunikorn-k8shim/pkg/cache.(*Context).updateNode(0xc00086f830, {0xc007aea000?, 0xc016770c08?}, {0x2509680, 0xc006e99b08})
>     github.com/apache/yunikorn-k8shim/pkg/cache/context.go:176 +0x291
> github.com/apache/yunikorn-k8shim/pkg/cache.(*Context).addNode(...)
>     github.com/apache/yunikorn-k8shim/pkg/cache/context.go:165
> k8s.io/client-go/tools/cache.ResourceEventHandlerFuncs.OnAdd(...)
>     k8s.io/[email protected]/tools/cache/controller.go:246
> k8s.io/client-go/tools/cache.(*processorListener).run.func1()
>     k8s.io/[email protected]/tools/cache/shared_informer.go:978 +0xa9
> k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0xc000a80008?)
>     k8s.io/[email protected]/pkg/util/wait/backoff.go:226 +0x33
> k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc082d6bf70, {0x28a6800, 0xc0409f8120}, 0x1, 0xc01434b340)
>     k8s.io/[email protected]/pkg/util/wait/backoff.go:227 +0xaf
> k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc010b79770, 0x3b9aca00, 0x0, 0x1, 0xc01434b340)
>     k8s.io/[email protected]/pkg/util/wait/backoff.go:204 +0x7f
> k8s.io/apimachinery/pkg/util/wait.Until(...)
>     k8s.io/[email protected]/pkg/util/wait/backoff.go:161
> k8s.io/client-go/tools/cache.(*processorListener).run(0xc01b59afc0)
>     k8s.io/[email protected]/tools/cache/shared_informer.go:972 +0x5a
> k8s.io/apimachinery/pkg/util/wait.(*Group).Start.func1()
>     k8s.io/[email protected]/pkg/util/wait/wait.go:72 +0x4c
> created by k8s.io/apimachinery/pkg/util/wait.(*Group).Start in goroutine 1
>     k8s.io/[email protected]/pkg/util/wait/wait.go:70 +0x73
--
This message was sent by Atlassian Jira
(v8.20.10#820010)