Ignace Desimpel created CASSANDRA-5716: ------------------------------------------
Summary: Remark on cassandra-5273 : Hanging system after OutOfMemory. Server cannot die due to uncaughtException handling Key: CASSANDRA-5716 URL: https://issues.apache.org/jira/browse/CASSANDRA-5716 Project: Cassandra Issue Type: Improvement Components: Core Affects Versions: 1.2.6 Environment: linux Reporter: Ignace Desimpel Priority: Minor Fix For: 1.2.6 Possible incorrect handling of an OOM as a result of modifications made for issue cassandra-5273. I could reproduce the OOM, with the patch of Cassandra-5273 applied. The good news is that, at least in my case, it works fine : the system did die ! However, due to multiple uncaughtException handling, multiple threads are calling the exitThread.start() routine, causing an IllegalStateException. There are some other exceptions also, but that seems logical. Also, after calling the start() function, the thread(s) will continue to run, and that could not be wanted. Below I pasted the stack trace. Just for your information, after all this works, and I could restart the Cassandra server and redo the OOM 2013-06-27 16:28:15.384 Unable to reduce heap usage since there are no dirty column families java.lang.OutOfMemoryError: Java heap space Dumping heap to java_pid4808.hprof ... Heap dump file created [278960302 bytes in 2.659 secs] 2013-06-27 16:28:42.655 Exception in thread Thread[qtp1564441079-31,5,main] java.lang.OutOfMemoryError: Java heap space 2013-06-27 16:28:42.655 Exception in thread Thread[qtp1564441079-36,5,main] java.lang.OutOfMemoryError: Java heap space 2013-06-27 16:28:42.655 Exception in thread Thread[qtp1564441079-30,5,main] java.lang.OutOfMemoryError: Java heap space 2013-06-27 16:28:42.655 Exception in thread Thread[GossipTasks:1,5,main] java.lang.OutOfMemoryError: Java heap space 2013-06-27 16:28:42.655 Exception in thread Thread[OptionalTasks:1,5,main] java.lang.OutOfMemoryError: Java heap space 2013-06-27 16:28:42.655 Exception in thread Thread[PERIODIC-COMMIT-LOG-SYNCER,5,main] java.lang.OutOfMemoryError: Java heap space 2013-06-27 16:28:42.655 Exception in thread Thread[metrics-meter-tick-thread-2,5,main] java.lang.OutOfMemoryError: Java heap space 2013-06-27 16:28:42.656 Exception in thread Thread[EXPIRING-MAP-REAPER:1,5,main] java.lang.OutOfMemoryError: Java heap space at java.util.concurrent.ConcurrentHashMap$EntrySet.iterator(ConcurrentHashMap.java:1202) ~[na:1.6.0_29] at org.apache.cassandra.utils.ExpiringMap$1.run(ExpiringMap.java:88) ~[thrift/:na] at org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run(DebuggableScheduledThreadPoolExecutor.java:75) ~[thrift/:na] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) [na:1.6.0_29] at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317) [na:1.6.0_29] at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150) [na:1.6.0_29] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98) [na:1.6.0_29] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180) [na:1.6.0_29] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204) [na:1.6.0_29] at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) [na:1.6.0_29] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) [na:1.6.0_29] at java.lang.Thread.run(Thread.java:662) [na:1.6.0_29] 2013-06-27 16:28:42.656 Exception in thread Thread[ScheduledTasks:1,5,main] java.lang.OutOfMemoryError: Java heap space at com.yammer.metrics.stats.Snapshot.<init>(Snapshot.java:30) ~[metrics-core-2.0.3.jar:na] at com.yammer.metrics.stats.ExponentiallyDecayingSample.getSnapshot(ExponentiallyDecayingSample.java:107) ~[metrics-core-2.0.3.jar:na] at org.apache.cassandra.locator.DynamicEndpointSnitch.updateScores(DynamicEndpointSnitch.java:237) ~[thrift/:na] at org.apache.cassandra.locator.DynamicEndpointSnitch.access$0(DynamicEndpointSnitch.java:217) ~[thrift/:na] at org.apache.cassandra.locator.DynamicEndpointSnitch$1.run(DynamicEndpointSnitch.java:71) ~[thrift/:na] at org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run(DebuggableScheduledThreadPoolExecutor.java:75) ~[thrift/:na] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) [na:1.6.0_29] at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317) [na:1.6.0_29] at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150) [na:1.6.0_29] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98) [na:1.6.0_29] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180) [na:1.6.0_29] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204) [na:1.6.0_29] at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) [na:1.6.0_29] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) [na:1.6.0_29] at java.lang.Thread.run(Thread.java:662) [na:1.6.0_29] 2013-06-27 16:28:42.656 Exception in thread Thread[Thrift:3,5,main] java.lang.OutOfMemoryError: Java heap space at java.util.BitSet.initWords(BitSet.java:144) ~[na:1.6.0_29] at java.util.BitSet.<init>(BitSet.java:139) ~[na:1.6.0_29] at org.apache.cassandra.thrift.Column.<init>(Column.java:135) ~[thrift/:na] at org.apache.cassandra.thrift.Column.<init>(Column.java:158) ~[thrift/:na] at org.apache.cassandra.thrift.CassandraServer.thriftifyColumns(CassandraServer.java:184) ~[thrift/:na] at org.apache.cassandra.thrift.CassandraServer.thriftifyColumnFamily(CassandraServer.java:285) ~[thrift/:na] at org.apache.cassandra.thrift.CassandraServer.thriftifyKeySlices(CassandraServer.java:1071) ~[thrift/:na] at org.apache.cassandra.thrift.CassandraServer.get_range_slices(CassandraServer.java:960) ~[thrift/:na] at org.apache.cassandra.thrift.Cassandra$Processor$get_range_slices.getResult(Cassandra.java:3454) ~[thrift/:na] at org.apache.cassandra.thrift.Cassandra$Processor$get_range_slices.getResult(Cassandra.java:1) ~[thrift/:na] at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32) ~[libthrift-0.7.0.jar:0.7.0] at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34) ~[libthrift-0.7.0.jar:0.7.0] at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:199) ~[thrift/:na] at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) ~[na:1.6.0_29] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) ~[na:1.6.0_29] at java.lang.Thread.run(Thread.java:662) ~[na:1.6.0_29] 2013-06-27 16:28:42.655 Exception in thread Thread[COMMIT-LOG-ALLOCATOR,5,main] java.lang.OutOfMemoryError: Java heap space 2013-06-27 16:28:42.655 java.lang.OutOfMemoryError: Java heap space 2013-06-27 16:28:42.656 Failed deleting temp components for ..\cassandra\dbdatafile\system\schema_keyspaces\system-schema_keyspaces-tmp-ic-16 org.apache.cassandra.io.FSWriteError: java.io.IOException: Failed to delete C:\develop\configs\AnnotationServer7\bin\..\cassandra\dbdatafile\system\schema_keyspaces\system-schema_keyspaces-tmp-ic-16-Statistics.db at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:112) ~[thrift/:na] at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:103) ~[thrift/:na] at org.apache.cassandra.io.sstable.SSTable.delete(SSTable.java:146) ~[thrift/:na] at org.apache.cassandra.io.sstable.SSTableWriter.abort(SSTableWriter.java:309) ~[thrift/:na] at org.apache.cassandra.db.Memtable$FlushRunnable.writeSortedContents(Memtable.java:509) [thrift/:na] at org.apache.cassandra.db.Memtable$FlushRunnable.runWith(Memtable.java:448) [thrift/:na] at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48) [thrift/:na] at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) [thrift/:na] at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) [na:1.6.0_29] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) [na:1.6.0_29] at java.lang.Thread.run(Thread.java:662) [na:1.6.0_29] Caused by: java.io.IOException: Failed to delete C:\develop\configs\AnnotationServer7\bin\..\cassandra\dbdatafile\system\schema_keyspaces\system-schema_keyspaces-tmp-ic-16-Statistics.db ... 11 common frames omitted 2013-06-27 16:28:42.656 Exception in thread Thread[FlushWriter:2,5,main] org.apache.cassandra.io.FSWriteError: java.io.IOException: Failed to delete C:\develop\configs\AnnotationServer7\bin\..\cassandra\dbdatafile\system\schema_keyspaces\system-schema_keyspaces-tmp-ic-16-Statistics.db at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:112) ~[thrift/:na] at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:103) ~[thrift/:na] at org.apache.cassandra.io.sstable.SSTable.delete(SSTable.java:146) ~[thrift/:na] at org.apache.cassandra.io.sstable.SSTableWriter.abort(SSTableWriter.java:309) ~[thrift/:na] at org.apache.cassandra.db.Memtable$FlushRunnable.writeSortedContents(Memtable.java:509) ~[thrift/:na] at org.apache.cassandra.db.Memtable$FlushRunnable.runWith(Memtable.java:448) ~[thrift/:na] at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48) ~[thrift/:na] at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[thrift/:na] at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) ~[na:1.6.0_29] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) ~[na:1.6.0_29] at java.lang.Thread.run(Thread.java:662) ~[na:1.6.0_29] Caused by: java.io.IOException: Failed to delete C:\develop\configs\AnnotationServer7\bin\..\cassandra\dbdatafile\system\schema_keyspaces\system-schema_keyspaces-tmp-ic-16-Statistics.db ... 11 common frames omitted 2013-06-27 16:28:42.656 Stopping gossiper 2013-06-27 16:28:42.656 Stopping gossip by operator request 2013-06-27 16:28:42.656 Announcing shutdown 2013-06-27 16:28:44.684 Stopping RPC server 2013-06-27 16:28:44.684 Stop listening to thrift clients 2013-06-27 16:29:06.743 Exception in thread Thread[OptionalTasks:1,5,main] java.lang.IllegalThreadStateException: null at java.lang.Thread.start(Thread.java:638) ~[na:1.6.0_29] at org.apache.cassandra.service.CassandraDaemon$2.uncaughtException(CassandraDaemon.java:189) ~[thrift/:na] at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor.handleOrLog(DebuggableThreadPoolExecutor.java:213) ~[thrift/:na] at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor.logExceptionsAfterExecute(DebuggableThreadPoolExecutor.java:196) ~[thrift/:na] at org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor.afterExecute(DebuggableScheduledThreadPoolExecutor.java:46) ~[thrift/:na] at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:888) ~[na:1.6.0_29] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) ~[na:1.6.0_29] at java.lang.Thread.run(Thread.java:662) ~[na:1.6.0_29] 2013-06-27 16:29:06.743 Exception in thread Thread[ScheduledTasks:1,5,main] java.lang.IllegalThreadStateException: null at java.lang.Thread.start(Thread.java:638) [na:1.6.0_29] at org.apache.cassandra.service.CassandraDaemon$2.uncaughtException(CassandraDaemon.java:189) ~[thrift/:na] at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor.handleOrLog(DebuggableThreadPoolExecutor.java:213) ~[thrift/:na] at org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run(DebuggableScheduledThreadPoolExecutor.java:79) ~[thrift/:na] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) ~[na:1.6.0_29] at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317) ~[na:1.6.0_29] at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150) ~[na:1.6.0_29] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98) ~[na:1.6.0_29] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180) ~[na:1.6.0_29] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204) ~[na:1.6.0_29] at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) [na:1.6.0_29] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) [na:1.6.0_29] at java.lang.Thread.run(Thread.java:662) [na:1.6.0_29] 2013-06-27 16:29:06.743 Exception in thread Thread[EXPIRING-MAP-REAPER:1,5,main] java.lang.IllegalThreadStateException: null at java.lang.Thread.start(Thread.java:638) [na:1.6.0_29] at org.apache.cassandra.service.CassandraDaemon$2.uncaughtException(CassandraDaemon.java:189) ~[thrift/:na] at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor.handleOrLog(DebuggableThreadPoolExecutor.java:213) ~[thrift/:na] at org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run(DebuggableScheduledThreadPoolExecutor.java:79) ~[thrift/:na] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) ~[na:1.6.0_29] at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317) ~[na:1.6.0_29] at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150) ~[na:1.6.0_29] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98) ~[na:1.6.0_29] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180) ~[na:1.6.0_29] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204) ~[na:1.6.0_29] at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) [na:1.6.0_29] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) [na:1.6.0_29] at java.lang.Thread.run(Thread.java:662) [na:1.6.0_29] -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira