[jira] [Updated] (CASSANDRA-16668) Intermittent failure of SEPExecutorTest.changingMaxWorkersMeetsConcurrencyGoalsTest caused by race condition when shrinking maximum pool size to zero

2021-05-23 Thread Michael Semb Wever (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Semb Wever updated CASSANDRA-16668:
---
Fix Version/s: (was: 4.0.x)
   (was: 4.0-rc)
   4.1
   4.0-rc2

> Intermittent failure of 
> SEPExecutorTest.changingMaxWorkersMeetsConcurrencyGoalsTest caused by race 
> condition when shrinking maximum pool size to zero
> -
>
> Key: CASSANDRA-16668
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16668
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Other
>Reporter: Matt Fleming
>Assignee: Matt Fleming
>Priority: Normal
> Fix For: 4.0-rc2, 4.0, 4.1
>
>
> A difficult-to-hit race condition exists in 
> changingMaxWorkersMeetsConcurrencyGoalsTest when changing the maximum pool 
> size from 0 -> 4 which results in the test failing like so:
> {{junit.framework.AssertionFailedError: Test tasks did not hit max 
> concurrency goal expected: but 
> was:junit.framework.AssertionFailedError: Test tasks did not hit max 
> concurrency goal expected: but was: at 
> org.apache.cassandra.concurrent.SEPExecutorTest.assertMaxTaskConcurrency(SEPExecutorTest.java:198)
>  at 
> org.apache.cassandra.concurrent.SEPExecutorTest.changingMaxWorkersMeetsConcurrencyGoalsTest(SEPExecutorTest.java:132)}}
> I can hit this issue maybe 2/3 times for every 100 invocations of the unit 
> test.
> The issue that causes the failure is that if tasks are still enqueued when 
> the maximum pool size is set to zero and if all of the SEPWorker threads 
> enter the STOP state before the pool size is bumped to 4, then no SEPWorker 
> threads will be spun up to service the task queue. This causes the above 
> error.
> Why don't we spin up SEPWorker threads when enqueing tasks? Because of the 
> guard logic in addTask: 
> [https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/concurrent/SEPExecutor.java#L113,L121]
> In this scenario taskPermits will not be zero (because we have tasks on the 
> queue) so we never call {{maybeStartSpinningWorker()}}.
> A trick to make this issue much easier to hit is to insert a 
> {{Thread.sleep(500)}} immediately after setting the pool size to zero. This 
> has the effect of guaranteeing that all SEPWorker threads will be STOP'd 
> before enqueueing more work.
> Here's a fix that attempts to spin up an SEPWorker whenever we grow the 
> number of work permits: 
> https://github.com/mfleming/cassandra/commit/071516d29e41da9924af24e8002822d3c6af0e01



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-16668) Intermittent failure of SEPExecutorTest.changingMaxWorkersMeetsConcurrencyGoalsTest caused by race condition when shrinking maximum pool size to zero

2021-05-21 Thread Ekaterina Dimitrova (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ekaterina Dimitrova updated CASSANDRA-16668:

Since Version:   (was: 4.0-rc1)

> Intermittent failure of 
> SEPExecutorTest.changingMaxWorkersMeetsConcurrencyGoalsTest caused by race 
> condition when shrinking maximum pool size to zero
> -
>
> Key: CASSANDRA-16668
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16668
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Other
>Reporter: Matt Fleming
>Assignee: Matt Fleming
>Priority: Normal
> Fix For: 4.0, 4.0-rc, 4.0.x
>
>
> A difficult-to-hit race condition exists in 
> changingMaxWorkersMeetsConcurrencyGoalsTest when changing the maximum pool 
> size from 0 -> 4 which results in the test failing like so:
> {{junit.framework.AssertionFailedError: Test tasks did not hit max 
> concurrency goal expected: but 
> was:junit.framework.AssertionFailedError: Test tasks did not hit max 
> concurrency goal expected: but was: at 
> org.apache.cassandra.concurrent.SEPExecutorTest.assertMaxTaskConcurrency(SEPExecutorTest.java:198)
>  at 
> org.apache.cassandra.concurrent.SEPExecutorTest.changingMaxWorkersMeetsConcurrencyGoalsTest(SEPExecutorTest.java:132)}}
> I can hit this issue maybe 2/3 times for every 100 invocations of the unit 
> test.
> The issue that causes the failure is that if tasks are still enqueued when 
> the maximum pool size is set to zero and if all of the SEPWorker threads 
> enter the STOP state before the pool size is bumped to 4, then no SEPWorker 
> threads will be spun up to service the task queue. This causes the above 
> error.
> Why don't we spin up SEPWorker threads when enqueing tasks? Because of the 
> guard logic in addTask: 
> [https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/concurrent/SEPExecutor.java#L113,L121]
> In this scenario taskPermits will not be zero (because we have tasks on the 
> queue) so we never call {{maybeStartSpinningWorker()}}.
> A trick to make this issue much easier to hit is to insert a 
> {{Thread.sleep(500)}} immediately after setting the pool size to zero. This 
> has the effect of guaranteeing that all SEPWorker threads will be STOP'd 
> before enqueueing more work.
> Here's a fix that attempts to spin up an SEPWorker whenever we grow the 
> number of work permits: 
> https://github.com/mfleming/cassandra/commit/071516d29e41da9924af24e8002822d3c6af0e01



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-16668) Intermittent failure of SEPExecutorTest.changingMaxWorkersMeetsConcurrencyGoalsTest caused by race condition when shrinking maximum pool size to zero

2021-05-21 Thread Ekaterina Dimitrova (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ekaterina Dimitrova updated CASSANDRA-16668:

  Fix Version/s: 4.0.x
 4.0
  Since Version: 4.0-rc1
Source Control Link: 
https://github.com/apache/cassandra/commit/8cd02afce972ecaf0e0cf0fe09c610d67d9af9c5
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

> Intermittent failure of 
> SEPExecutorTest.changingMaxWorkersMeetsConcurrencyGoalsTest caused by race 
> condition when shrinking maximum pool size to zero
> -
>
> Key: CASSANDRA-16668
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16668
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Other
>Reporter: Matt Fleming
>Assignee: Matt Fleming
>Priority: Normal
> Fix For: 4.0, 4.0-rc, 4.0.x
>
>
> A difficult-to-hit race condition exists in 
> changingMaxWorkersMeetsConcurrencyGoalsTest when changing the maximum pool 
> size from 0 -> 4 which results in the test failing like so:
> {{junit.framework.AssertionFailedError: Test tasks did not hit max 
> concurrency goal expected: but 
> was:junit.framework.AssertionFailedError: Test tasks did not hit max 
> concurrency goal expected: but was: at 
> org.apache.cassandra.concurrent.SEPExecutorTest.assertMaxTaskConcurrency(SEPExecutorTest.java:198)
>  at 
> org.apache.cassandra.concurrent.SEPExecutorTest.changingMaxWorkersMeetsConcurrencyGoalsTest(SEPExecutorTest.java:132)}}
> I can hit this issue maybe 2/3 times for every 100 invocations of the unit 
> test.
> The issue that causes the failure is that if tasks are still enqueued when 
> the maximum pool size is set to zero and if all of the SEPWorker threads 
> enter the STOP state before the pool size is bumped to 4, then no SEPWorker 
> threads will be spun up to service the task queue. This causes the above 
> error.
> Why don't we spin up SEPWorker threads when enqueing tasks? Because of the 
> guard logic in addTask: 
> [https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/concurrent/SEPExecutor.java#L113,L121]
> In this scenario taskPermits will not be zero (because we have tasks on the 
> queue) so we never call {{maybeStartSpinningWorker()}}.
> A trick to make this issue much easier to hit is to insert a 
> {{Thread.sleep(500)}} immediately after setting the pool size to zero. This 
> has the effect of guaranteeing that all SEPWorker threads will be STOP'd 
> before enqueueing more work.
> Here's a fix that attempts to spin up an SEPWorker whenever we grow the 
> number of work permits: 
> https://github.com/mfleming/cassandra/commit/071516d29e41da9924af24e8002822d3c6af0e01



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-16668) Intermittent failure of SEPExecutorTest.changingMaxWorkersMeetsConcurrencyGoalsTest caused by race condition when shrinking maximum pool size to zero

2021-05-21 Thread Ekaterina Dimitrova (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ekaterina Dimitrova updated CASSANDRA-16668:

Status: Ready to Commit  (was: Review In Progress)

> Intermittent failure of 
> SEPExecutorTest.changingMaxWorkersMeetsConcurrencyGoalsTest caused by race 
> condition when shrinking maximum pool size to zero
> -
>
> Key: CASSANDRA-16668
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16668
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Other
>Reporter: Matt Fleming
>Assignee: Matt Fleming
>Priority: Normal
> Fix For: 4.0-rc
>
>
> A difficult-to-hit race condition exists in 
> changingMaxWorkersMeetsConcurrencyGoalsTest when changing the maximum pool 
> size from 0 -> 4 which results in the test failing like so:
> {{junit.framework.AssertionFailedError: Test tasks did not hit max 
> concurrency goal expected: but 
> was:junit.framework.AssertionFailedError: Test tasks did not hit max 
> concurrency goal expected: but was: at 
> org.apache.cassandra.concurrent.SEPExecutorTest.assertMaxTaskConcurrency(SEPExecutorTest.java:198)
>  at 
> org.apache.cassandra.concurrent.SEPExecutorTest.changingMaxWorkersMeetsConcurrencyGoalsTest(SEPExecutorTest.java:132)}}
> I can hit this issue maybe 2/3 times for every 100 invocations of the unit 
> test.
> The issue that causes the failure is that if tasks are still enqueued when 
> the maximum pool size is set to zero and if all of the SEPWorker threads 
> enter the STOP state before the pool size is bumped to 4, then no SEPWorker 
> threads will be spun up to service the task queue. This causes the above 
> error.
> Why don't we spin up SEPWorker threads when enqueing tasks? Because of the 
> guard logic in addTask: 
> [https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/concurrent/SEPExecutor.java#L113,L121]
> In this scenario taskPermits will not be zero (because we have tasks on the 
> queue) so we never call {{maybeStartSpinningWorker()}}.
> A trick to make this issue much easier to hit is to insert a 
> {{Thread.sleep(500)}} immediately after setting the pool size to zero. This 
> has the effect of guaranteeing that all SEPWorker threads will be STOP'd 
> before enqueueing more work.
> Here's a fix that attempts to spin up an SEPWorker whenever we grow the 
> number of work permits: 
> https://github.com/mfleming/cassandra/commit/071516d29e41da9924af24e8002822d3c6af0e01



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-16668) Intermittent failure of SEPExecutorTest.changingMaxWorkersMeetsConcurrencyGoalsTest caused by race condition when shrinking maximum pool size to zero

2021-05-21 Thread Ekaterina Dimitrova (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ekaterina Dimitrova updated CASSANDRA-16668:

Reviewers: Andres de la Peña, Ekaterina Dimitrova, Jon Meredith  (was: 
Andres de la Peña, Jon Meredith)

> Intermittent failure of 
> SEPExecutorTest.changingMaxWorkersMeetsConcurrencyGoalsTest caused by race 
> condition when shrinking maximum pool size to zero
> -
>
> Key: CASSANDRA-16668
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16668
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Other
>Reporter: Matt Fleming
>Assignee: Matt Fleming
>Priority: Normal
> Fix For: 4.0-rc
>
>
> A difficult-to-hit race condition exists in 
> changingMaxWorkersMeetsConcurrencyGoalsTest when changing the maximum pool 
> size from 0 -> 4 which results in the test failing like so:
> {{junit.framework.AssertionFailedError: Test tasks did not hit max 
> concurrency goal expected: but 
> was:junit.framework.AssertionFailedError: Test tasks did not hit max 
> concurrency goal expected: but was: at 
> org.apache.cassandra.concurrent.SEPExecutorTest.assertMaxTaskConcurrency(SEPExecutorTest.java:198)
>  at 
> org.apache.cassandra.concurrent.SEPExecutorTest.changingMaxWorkersMeetsConcurrencyGoalsTest(SEPExecutorTest.java:132)}}
> I can hit this issue maybe 2/3 times for every 100 invocations of the unit 
> test.
> The issue that causes the failure is that if tasks are still enqueued when 
> the maximum pool size is set to zero and if all of the SEPWorker threads 
> enter the STOP state before the pool size is bumped to 4, then no SEPWorker 
> threads will be spun up to service the task queue. This causes the above 
> error.
> Why don't we spin up SEPWorker threads when enqueing tasks? Because of the 
> guard logic in addTask: 
> [https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/concurrent/SEPExecutor.java#L113,L121]
> In this scenario taskPermits will not be zero (because we have tasks on the 
> queue) so we never call {{maybeStartSpinningWorker()}}.
> A trick to make this issue much easier to hit is to insert a 
> {{Thread.sleep(500)}} immediately after setting the pool size to zero. This 
> has the effect of guaranteeing that all SEPWorker threads will be STOP'd 
> before enqueueing more work.
> Here's a fix that attempts to spin up an SEPWorker whenever we grow the 
> number of work permits: 
> https://github.com/mfleming/cassandra/commit/071516d29e41da9924af24e8002822d3c6af0e01



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-16668) Intermittent failure of SEPExecutorTest.changingMaxWorkersMeetsConcurrencyGoalsTest caused by race condition when shrinking maximum pool size to zero

2021-05-20 Thread Jira


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andres de la Peña updated CASSANDRA-16668:
--
Reviewers: Andres de la Peña, Jon Meredith, Andres de la Peña  (was: Andres 
de la Peña, Jon Meredith)
   Andres de la Peña, Jon Meredith, Andres de la Peña  (was: Andres 
de la Peña, Jon Meredith)
   Status: Review In Progress  (was: Patch Available)

> Intermittent failure of 
> SEPExecutorTest.changingMaxWorkersMeetsConcurrencyGoalsTest caused by race 
> condition when shrinking maximum pool size to zero
> -
>
> Key: CASSANDRA-16668
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16668
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Other
>Reporter: Matt Fleming
>Assignee: Matt Fleming
>Priority: Normal
> Fix For: 4.0-rc
>
>
> A difficult-to-hit race condition exists in 
> changingMaxWorkersMeetsConcurrencyGoalsTest when changing the maximum pool 
> size from 0 -> 4 which results in the test failing like so:
> {{junit.framework.AssertionFailedError: Test tasks did not hit max 
> concurrency goal expected: but 
> was:junit.framework.AssertionFailedError: Test tasks did not hit max 
> concurrency goal expected: but was: at 
> org.apache.cassandra.concurrent.SEPExecutorTest.assertMaxTaskConcurrency(SEPExecutorTest.java:198)
>  at 
> org.apache.cassandra.concurrent.SEPExecutorTest.changingMaxWorkersMeetsConcurrencyGoalsTest(SEPExecutorTest.java:132)}}
> I can hit this issue maybe 2/3 times for every 100 invocations of the unit 
> test.
> The issue that causes the failure is that if tasks are still enqueued when 
> the maximum pool size is set to zero and if all of the SEPWorker threads 
> enter the STOP state before the pool size is bumped to 4, then no SEPWorker 
> threads will be spun up to service the task queue. This causes the above 
> error.
> Why don't we spin up SEPWorker threads when enqueing tasks? Because of the 
> guard logic in addTask: 
> [https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/concurrent/SEPExecutor.java#L113,L121]
> In this scenario taskPermits will not be zero (because we have tasks on the 
> queue) so we never call {{maybeStartSpinningWorker()}}.
> A trick to make this issue much easier to hit is to insert a 
> {{Thread.sleep(500)}} immediately after setting the pool size to zero. This 
> has the effect of guaranteeing that all SEPWorker threads will be STOP'd 
> before enqueueing more work.
> Here's a fix that attempts to spin up an SEPWorker whenever we grow the 
> number of work permits: 
> https://github.com/mfleming/cassandra/commit/071516d29e41da9924af24e8002822d3c6af0e01



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-16668) Intermittent failure of SEPExecutorTest.changingMaxWorkersMeetsConcurrencyGoalsTest caused by race condition when shrinking maximum pool size to zero

2021-05-19 Thread Jira


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andres de la Peña updated CASSANDRA-16668:
--
Reviewers: Andres de la Peña, Jon Meredith  (was: Jon Meredith)

> Intermittent failure of 
> SEPExecutorTest.changingMaxWorkersMeetsConcurrencyGoalsTest caused by race 
> condition when shrinking maximum pool size to zero
> -
>
> Key: CASSANDRA-16668
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16668
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Other
>Reporter: Matt Fleming
>Assignee: Matt Fleming
>Priority: Normal
> Fix For: 4.0-rc
>
>
> A difficult-to-hit race condition exists in 
> changingMaxWorkersMeetsConcurrencyGoalsTest when changing the maximum pool 
> size from 0 -> 4 which results in the test failing like so:
> {{junit.framework.AssertionFailedError: Test tasks did not hit max 
> concurrency goal expected: but 
> was:junit.framework.AssertionFailedError: Test tasks did not hit max 
> concurrency goal expected: but was: at 
> org.apache.cassandra.concurrent.SEPExecutorTest.assertMaxTaskConcurrency(SEPExecutorTest.java:198)
>  at 
> org.apache.cassandra.concurrent.SEPExecutorTest.changingMaxWorkersMeetsConcurrencyGoalsTest(SEPExecutorTest.java:132)}}
> I can hit this issue maybe 2/3 times for every 100 invocations of the unit 
> test.
> The issue that causes the failure is that if tasks are still enqueued when 
> the maximum pool size is set to zero and if all of the SEPWorker threads 
> enter the STOP state before the pool size is bumped to 4, then no SEPWorker 
> threads will be spun up to service the task queue. This causes the above 
> error.
> Why don't we spin up SEPWorker threads when enqueing tasks? Because of the 
> guard logic in addTask: 
> [https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/concurrent/SEPExecutor.java#L113,L121]
> In this scenario taskPermits will not be zero (because we have tasks on the 
> queue) so we never call {{maybeStartSpinningWorker()}}.
> A trick to make this issue much easier to hit is to insert a 
> {{Thread.sleep(500)}} immediately after setting the pool size to zero. This 
> has the effect of guaranteeing that all SEPWorker threads will be STOP'd 
> before enqueueing more work.
> Here's a fix that attempts to spin up an SEPWorker whenever we grow the 
> number of work permits: 
> https://github.com/mfleming/cassandra/commit/071516d29e41da9924af24e8002822d3c6af0e01



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-16668) Intermittent failure of SEPExecutorTest.changingMaxWorkersMeetsConcurrencyGoalsTest caused by race condition when shrinking maximum pool size to zero

2021-05-13 Thread Ekaterina Dimitrova (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ekaterina Dimitrova updated CASSANDRA-16668:

Test and Documentation Plan: 
[https://github.com/mfleming/cassandra/commit/071516d29e41da9924af24e8002822d3c6af0e01]
 Status: Patch Available  (was: In Progress)

> Intermittent failure of 
> SEPExecutorTest.changingMaxWorkersMeetsConcurrencyGoalsTest caused by race 
> condition when shrinking maximum pool size to zero
> -
>
> Key: CASSANDRA-16668
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16668
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Other
>Reporter: Matt Fleming
>Assignee: Matt Fleming
>Priority: Normal
> Fix For: 4.0-rc
>
>
> A difficult-to-hit race condition exists in 
> changingMaxWorkersMeetsConcurrencyGoalsTest when changing the maximum pool 
> size from 0 -> 4 which results in the test failing like so:
> {{junit.framework.AssertionFailedError: Test tasks did not hit max 
> concurrency goal expected: but 
> was:junit.framework.AssertionFailedError: Test tasks did not hit max 
> concurrency goal expected: but was: at 
> org.apache.cassandra.concurrent.SEPExecutorTest.assertMaxTaskConcurrency(SEPExecutorTest.java:198)
>  at 
> org.apache.cassandra.concurrent.SEPExecutorTest.changingMaxWorkersMeetsConcurrencyGoalsTest(SEPExecutorTest.java:132)}}
> I can hit this issue maybe 2/3 times for every 100 invocations of the unit 
> test.
> The issue that causes the failure is that if tasks are still enqueued when 
> the maximum pool size is set to zero and if all of the SEPWorker threads 
> enter the STOP state before the pool size is bumped to 4, then no SEPWorker 
> threads will be spun up to service the task queue. This causes the above 
> error.
> Why don't we spin up SEPWorker threads when enqueing tasks? Because of the 
> guard logic in addTask: 
> [https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/concurrent/SEPExecutor.java#L113,L121]
> In this scenario taskPermits will not be zero (because we have tasks on the 
> queue) so we never call {{maybeStartSpinningWorker()}}.
> A trick to make this issue much easier to hit is to insert a 
> {{Thread.sleep(500)}} immediately after setting the pool size to zero. This 
> has the effect of guaranteeing that all SEPWorker threads will be STOP'd 
> before enqueueing more work.
> Here's a fix that attempts to spin up an SEPWorker whenever we grow the 
> number of work permits: 
> https://github.com/mfleming/cassandra/commit/071516d29e41da9924af24e8002822d3c6af0e01



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-16668) Intermittent failure of SEPExecutorTest.changingMaxWorkersMeetsConcurrencyGoalsTest caused by race condition when shrinking maximum pool size to zero

2021-05-13 Thread Jon Meredith (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jon Meredith updated CASSANDRA-16668:
-
 Bug Category: Parent values: Code(13163)Level 1 values: Bug - Unclear 
Impact(13164)
   Complexity: Normal
  Component/s: Local/Other
Discovered By: Unit Test
Fix Version/s: 4.0-rc
Reviewers: Jon Meredith
 Severity: Low
 Assignee: Matt Fleming
   Status: Open  (was: Triage Needed)

> Intermittent failure of 
> SEPExecutorTest.changingMaxWorkersMeetsConcurrencyGoalsTest caused by race 
> condition when shrinking maximum pool size to zero
> -
>
> Key: CASSANDRA-16668
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16668
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Other
>Reporter: Matt Fleming
>Assignee: Matt Fleming
>Priority: Normal
> Fix For: 4.0-rc
>
>
> A difficult-to-hit race condition exists in 
> changingMaxWorkersMeetsConcurrencyGoalsTest when changing the maximum pool 
> size from 0 -> 4 which results in the test failing like so:
> {{junit.framework.AssertionFailedError: Test tasks did not hit max 
> concurrency goal expected: but 
> was:junit.framework.AssertionFailedError: Test tasks did not hit max 
> concurrency goal expected: but was: at 
> org.apache.cassandra.concurrent.SEPExecutorTest.assertMaxTaskConcurrency(SEPExecutorTest.java:198)
>  at 
> org.apache.cassandra.concurrent.SEPExecutorTest.changingMaxWorkersMeetsConcurrencyGoalsTest(SEPExecutorTest.java:132)}}
> I can hit this issue maybe 2/3 times for every 100 invocations of the unit 
> test.
> The issue that causes the failure is that if tasks are still enqueued when 
> the maximum pool size is set to zero and if all of the SEPWorker threads 
> enter the STOP state before the pool size is bumped to 4, then no SEPWorker 
> threads will be spun up to service the task queue. This causes the above 
> error.
> Why don't we spin up SEPWorker threads when enqueing tasks? Because of the 
> guard logic in addTask: 
> [https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/concurrent/SEPExecutor.java#L113,L121]
> In this scenario taskPermits will not be zero (because we have tasks on the 
> queue) so we never call {{maybeStartSpinningWorker()}}.
> A trick to make this issue much easier to hit is to insert a 
> {{Thread.sleep(500)}} immediately after setting the pool size to zero. This 
> has the effect of guaranteeing that all SEPWorker threads will be STOP'd 
> before enqueueing more work.
> Here's a fix that attempts to spin up an SEPWorker whenever we grow the 
> number of work permits: 
> https://github.com/mfleming/cassandra/commit/071516d29e41da9924af24e8002822d3c6af0e01



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org