Yu-Lin Chen created YUNIKORN-2576: ------------------------------------- Summary: Data Race: Flaky tests in dispatcher_test.go Key: YUNIKORN-2576 URL: https://issues.apache.org/jira/browse/YUNIKORN-2576 Project: Apache YuniKorn Issue Type: Bug Reporter: Yu-Lin Chen Attachments: shim-race.txt
How to reproduce: # In Shim, run 'go test ./pkg/... -race -count=10 > shim-race.txt' {code:java} WARNING: DATA RACE Write at 0x0000035315e0 by goroutine 88: github.com/apache/yunikorn-k8shim/pkg/dispatcher.initDispatcher() /home/chenyulin0719/yunikorn/yunikorn-k8shim/pkg/dispatcher/dispatcher.go:73 +0x2c4 github.com/apache/yunikorn-k8shim/pkg/dispatcher.createDispatcher() /home/chenyulin0719/yunikorn/yunikorn-k8shim/pkg/dispatcher/dispatch_test.go:305 +0x2f runtime.Goexit() /usr/local/go/src/runtime/panic.go:626 +0x5d testing.(*T).FailNow() <autogenerated>:1 +0x31 gotest.tools/v3/assert.Equal() /home/chenyulin0719/go/pkg/mod/gotest.tools/v3@v3.5.1/assert/assert.go:205 +0x1aa github.com/apache/yunikorn-k8shim/pkg/dispatcher.TestDispatchTimeout() /home/chenyulin0719/yunikorn/yunikorn-k8shim/pkg/dispatcher/dispatch_test.go:244 +0x2ba testing.tRunner() /usr/local/go/src/testing/testing.go:1689 +0x21e testing.(*T).Run.gowrap1() /usr/local/go/src/testing/testing.go:1742 +0x44Previous read at 0x0000035315e0 by goroutine 90: github.com/apache/yunikorn-k8shim/pkg/dispatcher.(*Dispatcher).asyncDispatch.func1() /home/chenyulin0719/yunikorn/yunikorn-k8shim/pkg/dispatcher/dispatcher.go:188 +0x2f5 github.com/apache/yunikorn-k8shim/pkg/dispatcher.(*Dispatcher).asyncDispatch.gowrap1() /home/chenyulin0719/yunikorn/yunikorn-k8shim/pkg/dispatcher/dispatcher.go:197 +0x6eGoroutine 88 (running) created at: testing.(*T).Run() /usr/local/go/src/testing/testing.go:1742 +0x825 testing.runTests.func1() /usr/local/go/src/testing/testing.go:2161 +0x85 testing.tRunner() /usr/local/go/src/testing/testing.go:1689 +0x21e testing.runTests() /usr/local/go/src/testing/testing.go:2159 +0x8be testing.(*M).Run() /usr/local/go/src/testing/testing.go:2027 +0xf17 main.main() _testmain.go:55 +0x2bdGoroutine 90 (running) created at: github.com/apache/yunikorn-k8shim/pkg/dispatcher.(*Dispatcher).asyncDispatch() /home/chenyulin0719/yunikorn/yunikorn-k8shim/pkg/dispatcher/dispatcher.go:178 +0x391 github.com/apache/yunikorn-k8shim/pkg/dispatcher.(*Dispatcher).dispatch() /home/chenyulin0719/yunikorn/yunikorn-k8shim/pkg/dispatcher/dispatcher.go:164 +0xbb github.com/apache/yunikorn-k8shim/pkg/dispatcher.Dispatch() /home/chenyulin0719/yunikorn/yunikorn-k8shim/pkg/dispatcher/dispatcher.go:142 +0x71 github.com/apache/yunikorn-k8shim/pkg/dispatcher.TestDispatchTimeout() /home/chenyulin0719/yunikorn/yunikorn-k8shim/pkg/dispatcher/dispatch_test.go:232 +0x244 testing.tRunner() /usr/local/go/src/testing/testing.go:1689 +0x21e testing.(*T).Run.gowrap1() /usr/local/go/src/testing/testing.go:1742 +0x44 =================={code} Root Cause: * The [globla vairables|https://github.com/chenyulin0719/yunikorn-k8shim/blob/64b204a2fb3b83fde9d86ea58f5f0d1e42187472/pkg/dispatcher/dispatcher.go#L46-L51] in dispatcher.go is not protected when running unit tests. Each unit test will run initDispatcher() through [createDispatcher()|https://github.com/chenyulin0719/yunikorn-k8shim/blob/64b204a2fb3b83fde9d86ea58f5f0d1e42187472/pkg/dispatcher/dispatch_test.go#L305]. * Race occurs if any other unit tests read/write the global variables before or after initDispatcher(). ex: TestDispatchTimeout() [https://github.com/chenyulin0719/yunikorn-k8shim/blob/64b204a2fb3b83fde9d86ea58f5f0d1e42187472/pkg/dispatcher/dispatcher.go#L188] Solution to be discussed: # Refactor dispatcher.go and encapsulates global variables to Dispatcher struct , change Dispatcher.Start(), Dispatcher.Stop() to type method # Implement Singleton in getDispatcher() and add a new function newDispatcher() # Create a new Dispatcher for each unit test The race issue only happens in unit test becasue the shared vairable was protected by once.Do(initDispatcher) in dispatcher.go : [https://github.com/chenyulin0719/yunikorn-k8shim/blob/64b204a2fb3b83fde9d86ea58f5f0d1e42187472/pkg/dispatcher/dispatcher.go#L131] -- This message was sent by Atlassian Jira (v8.20.10#820010) --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@yunikorn.apache.org For additional commands, e-mail: dev-h...@yunikorn.apache.org