[ https://issues.apache.org/jira/browse/CASSANDRA-16668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael Semb Wever updated CASSANDRA-16668:
-------------------------------------------
    Fix Version/s:     (was: 4.0.x)
                       (was: 4.0-rc)
                   4.1
                   4.0-rc2

> Intermittent failure of SEPExecutorTest.changingMaxWorkersMeetsConcurrencyGoalsTest caused by race condition when shrinking maximum pool size to zero
> -----------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-16668
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16668
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local/Other
>            Reporter: Matt Fleming
>            Assignee: Matt Fleming
>            Priority: Normal
>             Fix For: 4.0-rc2, 4.0, 4.1
>
>
> A difficult-to-hit race condition exists in
> changingMaxWorkersMeetsConcurrencyGoalsTest when changing the maximum pool
> size from 0 -> 4, which results in the test failing like so:
> {{junit.framework.AssertionFailedError: Test tasks did not hit max
> concurrency goal expected:<true> but was:<false> at
> org.apache.cassandra.concurrent.SEPExecutorTest.assertMaxTaskConcurrency(SEPExecutorTest.java:198)
> at
> org.apache.cassandra.concurrent.SEPExecutorTest.changingMaxWorkersMeetsConcurrencyGoalsTest(SEPExecutorTest.java:132)}}
> I can hit this issue maybe two or three times for every 100 invocations of
> the unit test.
> The failure occurs when tasks are still enqueued at the moment the maximum
> pool size is set to zero, and all of the SEPWorker threads enter the STOP
> state before the pool size is bumped to 4. In that case no SEPWorker threads
> are spun up to service the task queue, which causes the above error.
> Why don't we spin up SEPWorker threads when enqueuing tasks? Because of the
> guard logic in addTask:
> [https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/concurrent/SEPExecutor.java#L113,L121]
> In this scenario taskPermits will not be zero (because we have tasks on the
> queue), so we never call {{maybeStartSpinningWorker()}}.
> A trick to make this issue much easier to hit is to insert a
> {{Thread.sleep(500)}} immediately after setting the pool size to zero. This
> guarantees that all SEPWorker threads will be STOP'd before more work is
> enqueued.
> Here's a fix that attempts to spin up an SEPWorker whenever we grow the
> number of work permits:
> https://github.com/mfleming/cassandra/commit/071516d29e41da9924af24e8002822d3c6af0e01
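
The stall described above can be illustrated with a small single-threaded model. This is only a sketch under stated assumptions, not Cassandra's SEPExecutor and not the linked fix: the class PoolResizeSketch and the names wakeOnGrow, liveWorkers, maybeStartWorker and tasksRunAfterRegrow are hypothetical, "the queue was empty" stands in for the taskPermits == 0 guard, and shrinking to zero is modelled as every worker reaching STOP immediately.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Toy, single-threaded model of the stall. All names here are hypothetical. */
final class PoolResizeSketch {
    private final Queue<Runnable> queue = new ArrayDeque<>();
    private final boolean wakeOnGrow; // true models "start a worker when permits grow"
    private int maxWorkers;
    private int liveWorkers;          // workers that are not in the STOP state

    PoolResizeSketch(int maxWorkers, boolean wakeOnGrow) {
        this.maxWorkers = maxWorkers;
        this.wakeOnGrow = wakeOnGrow;
    }

    void addTask(Runnable task) {
        boolean wasEmpty = queue.isEmpty(); // stand-in for the taskPermits == 0 guard
        queue.add(task);
        if (wasEmpty) {
            maybeStartWorker(); // skipped whenever work is already queued
        }
    }

    void setMaxWorkers(int newMax) {
        int oldMax = maxWorkers;
        maxWorkers = newMax;
        // Assumption: surplus workers stop at once when the limit shrinks.
        liveWorkers = Math.min(liveWorkers, newMax);
        if (wakeOnGrow && newMax > oldMax) {
            maybeStartWorker();
        }
    }

    private void maybeStartWorker() {
        if (liveWorkers < maxWorkers) {
            liveWorkers++;
        }
    }

    /** Runs queued tasks if any worker is live; returns how many tasks ran. */
    int drain() {
        int ran = 0;
        while (liveWorkers > 0 && !queue.isEmpty()) {
            queue.poll().run();
            ran++;
        }
        return ran;
    }

    /** Replays the sequence from the report: enqueue, shrink to 0, enqueue, grow to 4. */
    static int tasksRunAfterRegrow(boolean wakeOnGrow) {
        PoolResizeSketch pool = new PoolResizeSketch(4, wakeOnGrow);
        for (int i = 0; i < 3; i++) {
            pool.addTask(() -> { });
        }
        pool.setMaxWorkers(0);   // every worker stops while tasks are still queued
        pool.addTask(() -> { }); // queue is non-empty, so the guard starts no worker
        pool.setMaxWorkers(4);
        return pool.drain();
    }

    public static void main(String[] args) {
        System.out.println("without wake on grow: " + tasksRunAfterRegrow(false));
        System.out.println("with wake on grow:    " + tasksRunAfterRegrow(true));
    }
}
```

In this model, tasksRunAfterRegrow(false) runs no tasks because nothing restarts a worker once the pool grows again, while tasksRunAfterRegrow(true) runs all four queued tasks. The real executor is concurrent and tracks permits atomically, so the model only shows why a wake-up is needed when the pool grows, not how the actual patch implements it.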