Hello. I have a six-node Cassandra cluster running on modest hardware, with 1G of heap assigned to Cassandra. After inserting about 245 million rows of data, Cassandra failed with a java.lang.OutOfMemoryError: Java heap space error. I raised the Java heap to 2G, but I still get the same error when trying to restart Cassandra.
I am using Cassandra 0.5.1 with Sun jre1.6.0_18. Any thoughts on how to resolve this issue are greatly appreciated.

Here are log excerpts from two of the nodes:

DEBUG [HINTED-HANDOFF-POOL:1] 2010-04-23 16:19:20,490 SliceQueryFilter.java (line 116) collecting SuperColumn(dcf9f19e [0a011d0d,])
DEBUG [HINTED-HANDOFF-POOL:1] 2010-04-23 16:19:20,490 SliceQueryFilter.java (line 116) collecting SuperColumn(dd04bf9c [0a011d0c,0a011d0d,])
DEBUG [HINTED-HANDOFF-POOL:1] 2010-04-23 16:19:20,490 SliceQueryFilter.java (line 116) collecting SuperColumn(dd08981a [0a011d0c,0a011d0d,])
DEBUG [HINTED-HANDOFF-POOL:1] 2010-04-23 16:19:20,490 SliceQueryFilter.java (line 116) collecting SuperColumn(dd7f7ac9 [0a011d0c,0a011d0d,])
DEBUG [HINTED-HANDOFF-POOL:1] 2010-04-23 16:19:20,490 SliceQueryFilter.java (line 116) collecting SuperColumn(dde1d4cf [0a011d0d,])
DEBUG [HINTED-HANDOFF-POOL:1] 2010-04-23 16:19:20,491 SliceQueryFilter.java (line 116) collecting SuperColumn(de32aec3 [0a011d0d,])
DEBUG [HINTED-HANDOFF-POOL:1] 2010-04-23 16:19:20,491 SliceQueryFilter.java (line 116) collecting SuperColumn(de378105 [0a011d0c,0a011d0d,])
DEBUG [HINTED-HANDOFF-POOL:1] 2010-04-23 16:19:20,491 SliceQueryFilter.java (line 116) collecting SuperColumn(deb5d591 [0a011d0d,])
DEBUG [HINTED-HANDOFF-POOL:1] 2010-04-23 16:19:20,491 SliceQueryFilter.java (line 116) collecting SuperColumn(ded75dee [0a011d0c,0a011d0d,])
DEBUG [HINTED-HANDOFF-POOL:1] 2010-04-23 16:19:20,491 SliceQueryFilter.java (line 116) collecting SuperColumn(defe3445 [0a011d0c,0a011d0d,])
INFO [FLUSH-TIMER] 2010-04-23 16:20:00,071 ColumnFamilyStore.java (line 393) IpTag has reached its threshold; switching in a fresh Memtable
INFO [FLUSH-TIMER] 2010-04-23 16:20:00,072 ColumnFamilyStore.java (line 1035) Enqueuing flush of Memtable(IpTag)@7816
INFO [FLUSH-SORTER-POOL:1] 2010-04-23 16:20:00,072 Memtable.java (line 183) Sorting Memtable(IpTag)@7816
INFO [FLUSH-WRITER-POOL:1] 2010-04-23 16:20:00,107 Memtable.java (line 192) Writing Memtable(IpTag)@7816
DEBUG [Timer-0] 2010-04-23 16:20:00,130 LoadDisseminator.java (line 39) Disseminating load info ...
ERROR [ROW-MUTATION-STAGE:41] 2010-04-23 16:20:00,348 CassandraDaemon.java (line 71) Fatal exception in thread Thread[ROW-MUTATION-STAGE:41,5,main]
java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOfRange(Unknown Source)
    at java.lang.String.<init>(Unknown Source)
    at java.lang.StringBuilder.toString(Unknown Source)
    at org.apache.cassandra.db.marshal.AbstractType.getColumnsString(AbstractType.java:87)
    at org.apache.cassandra.db.ColumnFamily.toString(ColumnFamily.java:344)
    at org.apache.commons.lang.ObjectUtils.toString(ObjectUtils.java:241)
    at org.apache.commons.lang.StringUtils.join(StringUtils.java:3073)
    at org.apache.commons.lang.StringUtils.join(StringUtils.java:3133)
    at org.apache.cassandra.db.RowMutation.toString(RowMutation.java:263)
    at java.lang.String.valueOf(Unknown Source)
    at java.lang.StringBuilder.append(Unknown Source)
    at org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:46)
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:38)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)

---

DEBUG [main] 2010-04-23 17:15:45,501 CommitLog.java (line 312) Reading mutation at 57527476
DEBUG [main] 2010-04-23 17:16:11,375 CommitLog.java (line 340) replaying mutation for system.Tracking: {ColumnFamily(HintsColumnFamily [7af4c5c0,])}
DEBUG [main] 2010-04-23 17:16:45,293 CommitLog.java (line 312) Reading mutation at 57527686
DEBUG [main] 2010-04-23 17:16:45,294 CommitLog.java (line 340) replaying mutation for system.Tracking: {ColumnFamily(HintsColumnFamily [7af4c5fb,])}
DEBUG [main] 2010-04-23 17:16:54,311 CommitLog.java (line 312) Reading mutation at 57527919
DEBUG [main] 2010-04-23 17:17:46,344 CommitLog.java (line 340) replaying mutation for system.Tracking: {ColumnFamily(HintsColumnFamily [7af4c5fb,])}
DEBUG [main] 2010-04-23 17:17:55,530 CommitLog.java (line 312) Reading mutation at 57528129
DEBUG [main] 2010-04-23 17:18:20,266 CommitLog.java (line 340) replaying mutation for system.Tracking: {ColumnFamily(HintsColumnFamily [7af4c607,])}
DEBUG [main] 2010-04-23 17:18:38,273 CommitLog.java (line 312) Reading mutation at 57528362
DEBUG [main] 2010-04-23 17:21:53,966 CommitLog.java (line 340) replaying mutation for system.Tracking: {ColumnFamily(HintsColumnFamily [7af4c607,])}
DEBUG [main] 2010-04-23 17:24:48,032 CommitLog.java (line 312) Reading mutation at 57528572
ERROR [RMI TCP Connection(idle)] 2010-04-23 17:36:38,932 CassandraDaemon.java (line 71) Fatal exception in thread Thread[RMI TCP Connection(idle),5,RMI Runtime]
java.lang.OutOfMemoryError: Java heap space
ERROR [RMI TCP Connection(idle)] 2010-04-23 17:36:38,952 CassandraDaemon.java (line 71) Fatal exception in thread Thread[RMI TCP Connection(idle),5,RMI Runtime]
java.lang.OutOfMemoryError: Java heap space
ERROR [RMI TCP Connection(idle)] 2010-04-23 17:36:38,952 CassandraDaemon.java (line 71) Fatal exception in thread Thread[RMI TCP Connection(idle),5,RMI Runtime]
java.lang.OutOfMemoryError: Java heap space
    at java.io.BufferedInputStream.<init>(Unknown Source)
    at java.io.BufferedInputStream.<init>(Unknown Source)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(Unknown Source)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
ERROR [RMI TCP Connection(idle)] 2010-04-23 17:36:38,966 CassandraDaemon.java (line 71) Fatal exception in thread Thread[RMI TCP Connection(idle),5,RMI Runtime]
java.lang.OutOfMemoryError: Java heap space
ERROR [main] 2010-04-23 17:36:38,966 CassandraDaemon.java (line 184) Exception encountered during startup.
java.lang.OutOfMemoryError: Java heap space
ERROR [RMI TCP Connection(idle)] 2010-04-23 17:36:38,981 CassandraDaemon.java (line 71) Fatal exception in thread Thread[RMI TCP Connection(idle),5,RMI Runtime]
java.lang.OutOfMemoryError: Java heap space

Here is my current configuration:

<Partitioner>org.apache.cassandra.dht.OrderPreservingPartitioner</Partitioner>
<ReplicaPlacementStrategy>org.apache.cassandra.locator.RackUnawareStrategy</ReplicaPlacementStrategy>
<ReplicationFactor>3</ReplicationFactor>
<RpcTimeoutInMillis>30000</RpcTimeoutInMillis>
<CommitLogRotationThresholdInMB>128</CommitLogRotationThresholdInMB>
<SlicedBufferSizeInKB>64</SlicedBufferSizeInKB>
<FlushDataBufferSizeInMB>32</FlushDataBufferSizeInMB>
<ColumnIndexSizeInKB>64</ColumnIndexSizeInKB>
<MemtableSizeInMB>64</MemtableSizeInMB>
<MemtableObjectCountInMillions>0.1</MemtableObjectCountInMillions>
<MemtableFlushAfterMinutes>60</MemtableFlushAfterMinutes>
<ConcurrentReads>8</ConcurrentReads>
<ConcurrentWrites>100</ConcurrentWrites>
<CommitLogSync>periodic</CommitLogSync>
<CommitLogSyncPeriodInMS>1000</CommitLogSyncPeriodInMS>
<GCGraceSeconds>864000</GCGraceSeconds>
<BinaryMemtableSizeInMB>256</BinaryMemtableSizeInMB>

Ring status:

Address       Status     Load          Range      Ring
                                       f
10.1.29.12    Down       7.26 GB       0          |<--|
10.1.29.13    Up         3.97 GB       3          |   ^
10.1.29.14    Up         7.73 GB       6          v   |
10.1.29.15    Down       14.27 GB      9          |   ^
10.1.29.16    Up         15.42 GB      c          v   |
10.1.29.17    Down       12.67 GB      f          |-->|