[ 
https://issues.apache.org/jira/browse/YUNIKORN-793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg closed YUNIKORN-793.
------------------------------------------
     Fix Version/s: 1.0.0
    Target Version: 1.0.0
        Resolution: Fixed

Change committed.

This Jira has been linked to the original Jira that introduced the new code.

> fix deadlock caused by listing queues with scheduling pending pods
> ------------------------------------------------------------------
>
>                 Key: YUNIKORN-793
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-793
>             Project: Apache YuniKorn
>          Issue Type: Bug
>            Reporter: Chia-Ping Tsai
>            Assignee: Chia-Ping Tsai
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: 1.0.0
>
>
> `GetPartitionQueues` acquires the read lock multiple times, including while 
> it already holds it. Once a thread is waiting for the write lock, all new 
> read locks are blocked. In short, the following execution order can cause a 
> deadlock:
> 1. hold read lock ---> thread 0
> 2. wait for write lock ---> thread 1 is blocked by thread 0
> 3. acquire read lock again ---> thread 0 is blocked by thread 1
> See the docs for more details ([https://pkg.go.dev/sync#RWMutex]).
> The pprof output is shown below.
> {noformat}
> 1 @ 0x43ada5 0x44ca85 0x44ca6e 0x46de27 0x47ce25 0x47e590 0x47e522 0x9c5753 0x9ad231 0x9fbcfa 0x9e4986 0x9e4851 0x9de655 0x9ff792 0x471c61
> #     0x46de26        sync.runtime_SemacquireMutex+0x46        /Users/chia7712/Library/go/default/src/runtime/sema.go:71
> #     0x47ce24        sync.(*Mutex).lockSlow+0x104        /Users/chia7712/Library/go/default/src/sync/mutex.go:138
> #     0x47e58f        sync.(*Mutex).Lock+0x8f        /Users/chia7712/Library/go/default/src/sync/mutex.go:81
> #     0x47e521        sync.(*RWMutex).Lock+0x21        /Users/chia7712/Library/go/default/src/sync/rwmutex.go:111
> #     0x9c5752        github.com/apache/incubator-yunikorn-core/pkg/scheduler/objects.(*Queue).incPendingResource+0x52        /Users/chia7712/go/pkg/mod/github.com/chia7712/incubator-yunikorn-core@v0.0.0-20210811001640-eaa6afb10b62/pkg/scheduler/objects/queue.go:454
> 1 @ 0x43ada5 0x44ca85 0x44ca6e 0x46de27 0x9c7eae 0x9c7e34 0x9c51f8 0x9c54ab 0xa59e45 0xa59e94 0x711024 0xa5b310 0x711024 0xa011b3 0x7145e3 0x70fb0d 0x471c61
> #     0x46de26        sync.runtime_SemacquireMutex+0x46        /Users/chia7712/Library/go/default/src/runtime/sema.go:71
> #     0x9c7ead        sync.(*RWMutex).RLock+0xad        /Users/chia7712/Library/go/default/src/sync/rwmutex.go:63
> #     0x9c7e33        github.com/apache/incubator-yunikorn-core/pkg/scheduler/objects.(*Queue).IsLeafQueue+0x33        /Users/chia7712/go/pkg/mod/github.com/chia7712/incubator-yunikorn-core@v0.0.0-20210811001640-eaa6afb10b62/pkg/scheduler/objects/queue.go:667
> #     0x9c51f7        github.com/apache/incubator-yunikorn-core/pkg/scheduler/objects.(*Queue).GetPartitionQueues+0x1f7        /Users/chia7712/go/pkg/mod/github.com/chia7712/incubator-yunikorn-core@v0.0.0-20210811001640-eaa6afb10b62/pkg/scheduler/objects/queue.go:426
> #     0x9c54aa        github.com/apache/incubator-yunikorn-core/pkg/scheduler/objects.(*Queue).GetPartitionQueues+0x4aa        /Users/chia7712/go/pkg/mod/github.com/chia7712/incubator-yunikorn-core@v0.0.0-20210811001640-eaa6afb10b62/pkg/scheduler/objects/queue.go:416
> {noformat}
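
For readers unfamiliar with the pattern in the description, the three-step order can be reproduced with a small standalone program. This is only a sketch under stated assumptions: the `queue` type and the names `nestedRead`, `singleRead`, `incPending`, and `finishes` are hypothetical and are not taken from the YuniKorn code base or from the committed fix. It contrasts re-acquiring the read lock while holding it against taking the read lock once and calling an unlocked helper, which is one common way to avoid the problem.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// queue is a hypothetical stand-in for a scheduler queue (not YuniKorn's type).
type queue struct {
	sync.RWMutex
	children []*queue
	pending  int
}

// isLeaf takes the read lock itself, so calling it while the same
// goroutine already holds the read lock is a nested read lock.
func (q *queue) isLeaf() bool {
	q.RLock()
	defer q.RUnlock()
	return q.isLeafInternal()
}

// isLeafInternal expects the caller to hold the lock already.
func (q *queue) isLeafInternal() bool { return len(q.children) == 0 }

// nestedRead holds the read lock and then acquires it again via isLeaf.
func (q *queue) nestedRead(pause func()) bool {
	q.RLock()
	defer q.RUnlock()
	pause() // a writer arrives here, between the two read locks
	return q.isLeaf()
}

// singleRead holds the read lock once and uses the unlocked helper.
func (q *queue) singleRead(pause func()) bool {
	q.RLock()
	defer q.RUnlock()
	pause()
	return q.isLeafInternal()
}

// incPending is the writer: it needs the write lock.
func (q *queue) incPending() {
	q.Lock()
	defer q.Unlock()
	q.pending++
}

// finishes reports whether read completes within one second when a
// writer starts waiting for the write lock in the middle of the read.
func finishes(read func(q *queue, pause func()) bool) bool {
	q := &queue{}
	done := make(chan struct{})
	go func() {
		read(q, func() {
			go q.incPending()
			// Give the writer time to block on Lock (timing-based, not a guarantee).
			time.Sleep(100 * time.Millisecond)
		})
		close(done)
	}()
	select {
	case <-done:
		return true
	case <-time.After(time.Second):
		return false
	}
}

func main() {
	fmt.Println("nested read finishes:", finishes((*queue).nestedRead))
	fmt.Println("single read finishes:", finishes((*queue).singleRead))
}
```

With the nested variant the reader and the writer should end up waiting on each other, so the call times out; the single-lock variant should complete because the reader never asks for the lock a second time. The sleep only makes the interleaving likely, so treat the program as an illustration rather than a deterministic test.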



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

