I'd suggest setting some Cassandra JVM parameters so that you can analyze a
heap dump and look through the GC logs.  That will give you some clues, e.g.
whether memory usage is growing steadily or jumping suddenly, and which
objects are using the memory.

-XX:+HeapDumpOnOutOfMemoryError

And if you don't want to wait six days for another failure, you can collect
a heap dump sooner with jmap -F.
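
For example, here is a rough sketch of taking a dump from the running node.
The helper name, the default output path, and the pgrep pattern are only
illustrative assumptions; adjust them for your install, and run it as the
same user that owns the Cassandra process.

```shell
# Hypothetical helper: dump the heap of the local Cassandra JVM to a file.
dump_cassandra_heap() {
    # Assumed default output path; pass another path as the first argument.
    out="${1:-/tmp/cassandra-heap.hprof}"
    # Assumes the process command line contains the main class name CassandraDaemon.
    pid="$(pgrep -f CassandraDaemon | head -n 1)"
    if [ -z "$pid" ]; then
        echo "no Cassandra process found" >&2
        return 1
    fi
    # Plain request first; add -F before -dump only if the JVM does not respond.
    jmap -dump:format=b,file="$out" "$pid"
}
```

The resulting .hprof file can be as large as the heap itself, so point it
at a disk with enough free space.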

-Xloggc:/path/to/where/to/put/the/gc.log
-XX:+PrintGCDetails
-XX:+PrintGCDateStamps
-XX:+PrintHeapAtGC
-XX:+PrintTenuringDistribution
-XX:+PrintGCApplicationStoppedTime
-XX:+PrintPromotionFailure
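
One place to put all of these is conf/cassandra-env.sh, which builds up the
JVM_OPTS variable. This is a minimal sketch, assuming a stock
cassandra-env.sh; the HeapDumpPath line and both paths are placeholders I
added, not values from your setup. The node has to be restarted for the
options to take effect.

```shell
# Write a heap dump when the JVM throws OutOfMemoryError.
JVM_OPTS="$JVM_OPTS -XX:+HeapDumpOnOutOfMemoryError"
# Optional (my addition): choose where the dump is written.
JVM_OPTS="$JVM_OPTS -XX:HeapDumpPath=/path/to/where/to/put/the/heapdump"

# GC logging: log file location plus the detail flags.
JVM_OPTS="$JVM_OPTS -Xloggc:/path/to/where/to/put/the/gc.log"
JVM_OPTS="$JVM_OPTS -XX:+PrintGCDetails"
JVM_OPTS="$JVM_OPTS -XX:+PrintGCDateStamps"
JVM_OPTS="$JVM_OPTS -XX:+PrintHeapAtGC"
JVM_OPTS="$JVM_OPTS -XX:+PrintTenuringDistribution"
JVM_OPTS="$JVM_OPTS -XX:+PrintGCApplicationStoppedTime"
JVM_OPTS="$JVM_OPTS -XX:+PrintPromotionFailure"
```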

Cheers,
Lee



On Wed, Dec 18, 2013 at 6:52 PM, Shammi Jayasinghe <sha...@wso2.com> wrote:

> Hi,
>
>
> We are facing a problem with Cassandra tuning: we hit the following OOM
> scenario [1] after running the system for 6 days. We have tuned Cassandra
> with the following values, which were obtained by going through a large
> number of testing cycles, but it has still gone OOM. I would like to know
> if someone can help identify the right tuning parameters.
>
> On this server we have given 6GB for the Xmx value, and the total memory
> on the server is 8GB. The Cassandra version is apache-cassandra-1.2.4.
>
> Tuning parameters:
>
> flush_largest_memtables_at: 0.5
> reduce_cache_sizes_at: 0.85
> reduce_cache_capacity_to: 0.6
> commitlog_total_space_in_mb: 16
> commitlog_segment_size_in_mb: 16
>
> Regarding the parameters above (flush_largest_memtables_at set to 0.5),
> I feel that the setting has not taken effect on the server. Is there any
> way to check whether it has been applied as expected?
>
>
>
> [1]WARN 19:16:50,355 Heap is 0.9971737408184552 full.  You may need to
> reduce memtable and/or cache sizes.  Cassandra will now flush up to the
> two largest memtables to free up memory.  Adjust
> flush_largest_memtables_at threshold in cassandra.yaml if you don't want
> Cassandra to do this automatically
>
>  WARN 19:18:19,784 Flushing CFS(Keyspace='QpidKeySpace',
> ColumnFamily='DestinationSubscriptionsCountRow') to relieve memory pressure
>
> ERROR 19:20:50,316 Exception in thread Thread[ReadStage:63,5,main]
> java.lang.OutOfMemoryError: Java heap space
>         at java.nio.ByteBuffer.wrap(ByteBuffer.java:350)
>         at java.nio.ByteBuffer.wrap(ByteBuffer.java:373)
>         at org.apache.cassandra.io.util.RandomAccessReader.readBytes(RandomAccessReader.java:391)
>         at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:392)
>         at org.apache.cassandra.utils.ByteBufferUtil.readWithShortLength(ByteBufferUtil.java:371)
>         at org.apache.cassandra.db.OnDiskAtom$Serializer.deserializeFromSSTable(OnDiskAtom.java:84)
>         at org.apache.cassandra.db.OnDiskAtom$Serializer.deserializeFromSSTable(OnDiskAtom.java:73)
>         at org.apache.cassandra.db.columniterator.IndexedSliceReader$IndexedBlockFetcher.getNextBlock(IndexedSliceReader.java:370)
>         at org.apache.cassandra.db.columniterator.IndexedSliceReader$IndexedBlockFetcher.fetchMoreData(IndexedSliceReader.java:325)
>         at org.apache.cassandra.db.columniterator.IndexedSliceReader.computeNext(IndexedSliceReader.java:151)
>         at org.apache.cassandra.db.columniterator.IndexedSliceReader.computeNext(IndexedSliceReader.java:48)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
>         at org.apache.cassandra.db.columniterator.SSTableSliceIterator.hasNext(SSTableSliceIterator.java:90)
>         at org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:171)
>         at org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:154)
>         at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:143)
>         at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:122)
>         at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:96)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
>         at org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:157)
>         at org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:136)
>         at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:84)
>         at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:293)
>         at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
>         at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1357)
>         at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1214)
>         at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1126)
>         at org.apache.cassandra.db.Table.getRow(Table.java:347)
>         at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:70)
>         at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1052)
>
>  INFO 19:20:51,397 Stop listening to thrift clients
>
> --
> Best Regards,
>
> *  Shammi Jayasinghe*
> Associate Tech Lead
> WSO2, Inc.; http://wso2.com,
> mobile: +94 71 4493085
>
>
