[jira] [Updated] (CASSANDRA-6788) Race condition silently kills thrift server
[ https://issues.apache.org/jira/browse/CASSANDRA-6788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams updated CASSANDRA-6788:
---------------------------------------

    Fix Version/s: 1.2.17

Race condition silently kills thrift server
-------------------------------------------

            Key: CASSANDRA-6788
            URL: https://issues.apache.org/jira/browse/CASSANDRA-6788
        Project: Cassandra
     Issue Type: Bug
       Reporter: Christian Rolf
       Assignee: Christian Rolf
        Fix For: 1.2.17, 2.0.7, 2.1 beta2
    Attachments: 6788-v2.txt, 6788-v3.txt, 6793-v3-rebased.txt, race_patch.diff

There's a race condition in CustomTThreadPoolServer that can cause the thrift server to silently stop listening for connections. It happens when the executor service throws a RejectedExecutionException, which is not caught.

Silent in the sense that OpsCenter doesn't notice any problem since JMX is still running fine.

--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (CASSANDRA-6788) Race condition silently kills thrift server
[ https://issues.apache.org/jira/browse/CASSANDRA-6788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

sankalp kohli updated CASSANDRA-6788:
------------------------------------

    Reproduced In: 2.0.5, 2.0.4, 1.2.16  (was: 2.0.4, 2.0.5)

Issue fields at the time of this update:

        Fix For: 2.0.7, 2.1 beta2
    Attachments: 6788-v2.txt, 6788-v3.txt, 6793-v3-rebased.txt, race_patch.diff
[jira] [Updated] (CASSANDRA-6788) Race condition silently kills thrift server
[ https://issues.apache.org/jira/browse/CASSANDRA-6788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-6788:
-------------------------------------

    Attachment: 6793-v3-rebased.txt

Issue fields at the time of this update:

    Attachments: 6788-v2.txt, 6788-v3.txt, 6793-v3-rebased.txt, race_patch.diff
[jira] [Updated] (CASSANDRA-6788) Race condition silently kills thrift server
[ https://issues.apache.org/jira/browse/CASSANDRA-6788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christian Rolf updated CASSANDRA-6788:
-------------------------------------

    Attachment: 6788-v3.txt

True; given that ThreadPoolExecutor.afterExecute is a no-op, the exception should be rare. Here's an alternative solution, in the hope that you prefer overrides to factories and extra exception handling as much as I do :-)

Issue fields at the time of this update:

    Attachments: 6788-v2.txt, 6788-v3.txt, race_patch.diff
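For readers unfamiliar with the hook mentioned above: ThreadPoolExecutor.afterExecute is a protected method with an empty default body, called in the worker thread once a task has finished. The sketch below shows, in general terms, what an override-based approach can look like. It is an illustration only and is not taken from 6788-v3.txt or any other attachment; the class name CountingExecutor and its methods active() and demo() are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration; not the Cassandra source and not any attached patch.
class CountingExecutor extends ThreadPoolExecutor {
    private final AtomicInteger active = new AtomicInteger();

    CountingExecutor(int maxThreads) {
        // SynchronousQueue: a task is either handed to a thread immediately or rejected.
        super(maxThreads, maxThreads, 60, TimeUnit.SECONDS, new SynchronousQueue<Runnable>());
    }

    @Override
    protected void beforeExecute(Thread t, Runnable r) {
        super.beforeExecute(t, r);
        active.incrementAndGet();
    }

    @Override
    protected void afterExecute(Runnable r, Throwable t) {
        // Runs in the worker thread after the task body has returned or thrown.
        active.decrementAndGet();
        super.afterExecute(r, t);
    }

    int active() {
        return active.get();
    }

    /** Returns {count while one task is running, count after the pool has terminated}. */
    static int[] demo() {
        CountingExecutor executor = new CountingExecutor(2);
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        try {
            executor.execute(() -> {
                started.countDown();
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            started.await();                 // the task body is running, so beforeExecute has run
            int during = executor.active();
            release.countDown();
            executor.shutdown();
            executor.awaitTermination(5, TimeUnit.SECONDS);
            return new int[] { during, executor.active() };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int[] counts = demo();
        System.out.println("during=" + counts[0] + " after=" + counts[1]);
    }
}
```

Even with the count released in afterExecute, the worker thread still has to finish its own bookkeeping before it can take the next task, so a sketch like this narrows the window rather than closing it.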
[jira] [Updated] (CASSANDRA-6788) Race condition silently kills thrift server
[ https://issues.apache.org/jira/browse/CASSANDRA-6788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-6788:
-------------------------------------

    Reproduced In: 2.0.5, 2.0.4  (was: 2.0.4, 2.0.5)
         Priority: Major  (was: Critical)

Issue fields at the time of this update:

    Attachments: race_patch.diff

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (CASSANDRA-6788) Race condition silently kills thrift server
[ https://issues.apache.org/jira/browse/CASSANDRA-6788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-6788:
-------------------------------------

    Attachment: 6788-v2.txt

I see. But that doesn't eliminate the window for a race, just reduces it. (ThreadPoolExecutor.runWorker still needs to call afterExecute and do its own bookkeeping.)

v2 adds an explicit catch for RejectedExecutionException. This is better than dying, but it will accept connections and then drop them on the floor if necessary, which is obviously suboptimal. The moral is not to push right up to the edge of max connections. :)

Issue fields at the time of this update:

    Attachments: 6788-v2.txt, race_patch.diff
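The trade-off described above can be reproduced outside Cassandra with a plain ThreadPoolExecutor. The following is a minimal sketch, not the contents of 6788-v2.txt: the class RejectionTolerantLoop and its method serve() are hypothetical, and a for loop stands in for the server's accept loop. It assumes a pool backed by a SynchronousQueue, so a submission is rejected whenever every thread is busy.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for an accept loop; not the Cassandra source.
class RejectionTolerantLoop {

    /** Submits `connections` blocking tasks to a pool of `maxThreads`; returns how many were dropped. */
    static int serve(int connections, int maxThreads) {
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                maxThreads, maxThreads, 60, TimeUnit.SECONDS, new SynchronousQueue<Runnable>());
        // Keeps every worker busy until all submissions have been attempted.
        CountDownLatch release = new CountDownLatch(1);
        int dropped = 0;
        try {
            for (int i = 0; i < connections; i++) {
                try {
                    executor.execute(() -> {
                        try {
                            release.await();
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    // Without this catch the exception would propagate out of the loop and end it.
                    // With it, the loop keeps going, but this "connection" is dropped.
                    dropped++;
                }
            }
            release.countDown();
            executor.shutdown();
            executor.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return dropped;
    }

    public static void main(String[] args) {
        System.out.println("dropped=" + serve(3, 2));
    }
}
```

With two threads and three submissions, the third is rejected and counted as dropped while the loop itself survives, which is the "better than dying" behaviour at the cost of losing an accepted connection.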