[jira] [Commented] (CASSANDRA-5273) Hanging system after OutOfMemory. Server cannot die due to uncaughtException handling
[ https://issues.apache.org/jira/browse/CASSANDRA-5273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13664186#comment-13664186 ]

Marcus Eriksson commented on CASSANDRA-5273:

lgtm

> Hanging system after OutOfMemory. Server cannot die due to uncaughtException handling
>
> Key: CASSANDRA-5273
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5273
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Affects Versions: 1.2.1
> Environment: linux, 64 bit
> Reporter: Ignace Desimpel
> Assignee: Marcus Eriksson
> Priority: Minor
> Fix For: 2.0
>
> Attachments:
>   0001-CASSANDRA-5273-add-timeouts-to-the-blocking-commitlo.patch,
>   0001-CASSANDRA-5273-add-timeouts-to-the-blocking-commitlo.patch,
>   5273-v2.txt, 5273-v3.txt, CassHangs.txt
>
> On an out-of-memory exception, there is an uncaught exception handler that
> calls System.exit(). However, multiple threads call this handler, causing a
> deadlock, and the server cannot stop. See
> http://www.mail-archive.com/user@cassandra.apache.org/msg27898.html and the
> stack trace in the attachment.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-5273) Hanging system after OutOfMemory. Server cannot die due to uncaughtException handling
[ https://issues.apache.org/jira/browse/CASSANDRA-5273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13663065#comment-13663065 ]

Marcus Eriksson commented on CASSANDRA-5273:

[~jbellis] you think the timeouts would be enough?
[jira] [Commented] (CASSANDRA-5273) Hanging system after OutOfMemory. Server cannot die due to uncaughtException handling
[ https://issues.apache.org/jira/browse/CASSANDRA-5273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644517#comment-13644517 ]

Jonathan Ellis commented on CASSANDRA-5273:

bq. the threads that would have called System.exit

I don't think we're very rigorous about calling Thread.setDaemon, so I think this will actually deadlock it -- System.exit will wait for daemon threads to die, and the daemon threads will park at the lock acquisition.
[jira] [Commented] (CASSANDRA-5273) Hanging system after OutOfMemory. Server cannot die due to uncaughtException handling
[ https://issues.apache.org/jira/browse/CASSANDRA-5273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13640207#comment-13640207 ]

Ignace Desimpel commented on CASSANDRA-5273:

Just an idea: one could say that the problem is caused by the Java runtime holding a lock during System.exit(). At the same time, the Cassandra code (the uncaught exception handler) is potentially calling System.exit() many times. Would it not be safer and cleaner for the code in the handler to call System.exit() at most once, avoiding the JRE lock and letting everything die in a 'normal' way?
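The "call System.exit() at most once" idea in the comment above could look roughly like the sketch below. This is an illustration only, not the code from any attached patch: the class name ExitOnceHandler, the injectable exit action, and the exit status 100 are all assumptions made for the example.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/**
 * Hypothetical handler (not Cassandra's actual code): only the first thread
 * that reports an OutOfMemoryError triggers the exit action. Later callers
 * return immediately instead of blocking inside System.exit().
 */
final class ExitOnceHandler implements Thread.UncaughtExceptionHandler {
    private final AtomicBoolean exiting = new AtomicBoolean(false);
    private final Runnable exitAction;

    /** Default action: exit the JVM. The status 100 is an arbitrary choice for this sketch. */
    ExitOnceHandler() {
        this(() -> System.exit(100));
    }

    /** The exit action is injectable so the guard can be exercised without killing the JVM. */
    ExitOnceHandler(Runnable exitAction) {
        this.exitAction = exitAction;
    }

    @Override
    public void uncaughtException(Thread t, Throwable e) {
        if (!(e instanceof OutOfMemoryError)) {
            return; // this sketch only reacts to OOM; real code would at least log other errors
        }
        // compareAndSet succeeds for exactly one caller, however many threads fail at once.
        if (exiting.compareAndSet(false, true)) {
            exitAction.run();
        }
    }
}
```

It would be installed with Thread.setDefaultUncaughtExceptionHandler(new ExitOnceHandler()). Note that the guard only stops additional threads from piling up inside System.exit(). It would not help if a shutdown hook waits on the very thread that made the single System.exit() call, since that thread stays blocked until the hooks finish.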
[jira] [Commented] (CASSANDRA-5273) Hanging system after OutOfMemory. Server cannot die due to uncaughtException handling
[ https://issues.apache.org/jira/browse/CASSANDRA-5273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13611873#comment-13611873 ]

Jonathan Ellis commented on CASSANDRA-5273:

Do we have any better options than just adding a timeout to the appendingThread.join call?
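For reference, adding a timeout to a join call, as the comment above suggests, might be sketched as follows. The helper name BoundedJoin.joinWithTimeout is hypothetical and the timeout value is left to the caller; only Thread.join(long) and Thread.isAlive() are standard JDK methods.

```java
/**
 * Hypothetical helper (not taken from the attached patches): waits for a
 * thread to finish, but never longer than the given timeout.
 */
final class BoundedJoin {
    private BoundedJoin() { }

    /**
     * @return true if the worker terminated within timeoutMillis, false if it
     *         is still alive when the wait ends
     */
    static boolean joinWithTimeout(Thread worker, long timeoutMillis) {
        if (timeoutMillis <= 0) {
            // Thread.join(0) means "wait forever", which is exactly what we want to avoid.
            throw new IllegalArgumentException("timeoutMillis must be positive");
        }
        try {
            worker.join(timeoutMillis); // returns when the thread dies or the timeout elapses
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
        return !worker.isAlive();
    }
}
```

A shutdown path could call joinWithTimeout(appendingThread, someTimeout) and continue even when it returns false. The trade-off is that work still pending in that thread may be abandoned, which is presumably why the comment asks whether a better option exists.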