[ https://issues.apache.org/jira/browse/CASSANDRA-13213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15862589#comment-15862589 ]
Zachary Girouard edited comment on CASSANDRA-13213 at 2/12/17 12:16 AM: ------------------------------------------------------------------------ Some additional info. Right now the cluster is an 8 node setup, 32G memory, E5-2530 (10 core, 2.20Ghz). There are 4 10k RPM drives in use, 1 of them has the commit log and the other three are setup as JBOD. The primary use is storing time series data with kairosdb. The cluster was originally running 2.2 and consisted of three data centers, with the other two being roughly the same hardware but only 4 nodes each. I consolidated down to the biggest datacenter while trying to debug this issue. I saw no problems with 2.2 performance wise. I did have pretty bad disk use imbalance, which is why I tried upgrading things. One thing that may be relevant: the cluster was not being repaired consistently, although there are very few deletes occurring, The ones that do are usually overwriting old data points with newer values, but that doesn't occur frequently at all (I'd be shocked if more than 2% the data points were overwritten). Over time the JBOD disk use became very unbalanced, so I'd reached a point where there wasn't really enough headroom to do a repair. was (Author: zakk4223): Some additional info. Right now the cluster is an 8 node setup, 32G memory, E5-2530 (10 core, 2.20Ghz). There are 4 10k RPM drives in use, 1 of them has the commit log and the other three are setup as JBOD. The primary use is storing time series data with kairosdb. The cluster was originally running 2.2 and consisted of three data centers, with the other two being roughly the same hardware but only 4 nodes each. I consolidated down to the biggest datacenter while trying to debug this issue. I saw no problems with 2.2 performance wise. I did have pretty bad disk use imbalance, which is why I tried upgrading things. One thing that may be relevant: the cluster was not being repaired consistently, although there are very view deletes occurring, The ones that do are usually overwriting old data points with newer values, but that doesn't occur frequently at all (I'd be shocked if more than 2% the data points were overwritten). Over time the JBOD disk use became very unbalanced, so I'd reached a point where there wasn't really enough headroom to do a repair. > compactionstats not available, node eventually OOMs due to pending mutations > ---------------------------------------------------------------------------- > > Key: CASSANDRA-13213 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13213 > Project: Cassandra > Issue Type: Bug > Components: Compaction, Local Write-Read Paths > Reporter: Zachary Girouard > > I'm seeing semi-frequent instances of nodetool compactionstats hanging > forever. While this is occurring none of the compaction metrics are available > via jmx/jconsole. > Sometimes the node will eventually recover, but I'm seeing a pattern where a > node will exhibit this behavior, and then eventually pending mutations start > piling up and the node dies due to OOM. Sometimes pending gossip operations > starting piling up too, but I think this is due to the impending OOM causing > everything to bog down. > As an experiment I turned auto compaction off on all the nodes and I haven't > seen this issue occur since I did that. Additionally, I'm running > relocatesstables on some nodes with unthrottled compaction and so far none of > them have had any issues handling it. > I managed to get some stack traces from a dying node: > All MutationStage threads look similar to this: > {noformat} > Name: MutationStage-10 > > State: WAITING > Total blocked: 9 Total waited: 1,959,850 > Stack trace: > sun.misc.Unsafe.park(Native Method) > java.util.concurrent.locks.LockSupport.park(Unknown Source) > org.apache.cassandra.utils.concurrent.WaitQueue$AbstractSignal.awaitUninterruptibly(WaitQueue.java:279) > org.apache.cassandra.utils.memory.MemtableAllocator$SubAllocator.allocate(MemtableAllocator.java:162) > org.apache.cassandra.utils.memory.SlabAllocator.allocate(SlabAllocator.java:89) > org.apache.cassandra.utils.memory.ContextAllocator.allocate(ContextAllocator.java:57) > org.apache.cassandra.utils.memory.ContextAllocator.clone(ContextAllocator.java:47) > org.apache.cassandra.db.rows.BufferCell.copy(BufferCell.java:122) > org.apache.cassandra.utils.memory.AbstractAllocator$CloningBTreeRowBuilder.addCell(AbstractAllocator.java:72) > org.apache.cassandra.db.rows.Rows.copy(Rows.java:51) > org.apache.cassandra.db.partitions.AtomicBTreePartition$RowUpdater.apply(AtomicBTreePartition.java:332) > org.apache.cassandra.db.partitions.AtomicBTreePartition$RowUpdater.apply(AtomicBTreePartition.java:295) > org.apache.cassandra.utils.btree.NodeBuilder.addNewKey(NodeBuilder.java:323) > org.apache.cassandra.utils.btree.NodeBuilder.update(NodeBuilder.java:184) > org.apache.cassandra.utils.btree.TreeBuilder.update(TreeBuilder.java:95) > org.apache.cassandra.utils.btree.BTree.update(BTree.java:182) > org.apache.cassandra.db.partitions.AtomicBTreePartition.addAllWithSizeDelta(AtomicBTreePartition.java:156) > org.apache.cassandra.db.Memtable.put(Memtable.java:284)org.apache.cassandra.db.ColumnFamilyStore.apply(ColumnFamilyStore.java:1316) > org.apache.cassandra.db.Keyspace.applyInternal(Keyspace.java:618) > org.apache.cassandra.db.Keyspace.applyFuture(Keyspace.java:425) > org.apache.cassandra.db.Mutation.applyFuture(Mutation.java:222) > org.apache.cassandra.db.MutationVerbHandler.doVerb(MutationVerbHandler.java:68) > org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:66) > java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) > org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162) > org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:134) > org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) > java.lang.Thread.run(Unknown Source) > {noformat} > The Compaction threads > {noformat} > Name: CompactionExecutor:4 > State: RUNNABLE > Total blocked: 32,781,277 Total waited: 549 > Stack trace: > org.apache.cassandra.io.sstable.format.SSTableReader.getTotalBytes(SSTableReader.java:661) > org.apache.cassandra.db.compaction.LeveledManifest.getCandidatesFor(LeveledManifest.java:669) > org.apache.cassandra.db.compaction.LeveledManifest.getCompactionCandidates(LeveledManifest.java:385) > - locked org.apache.cassandra.db.compaction.LeveledManifest@55c79600 > > org.apache.cassandra.db.compaction.LeveledCompactionStrategy.getNextBackgroundTask(LeveledCompactionStrategy.java:119) > > org.apache.cassandra.db.compaction.CompactionStrategyManager.getNextBackgroundTask(CompactionStrategyManager.java:119) > > org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:261) > java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) > java.util.concurrent.FutureTask.run(Unknown Source) > java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) > java.util.concurrent.FutureTask.run(Unknown Source) > java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) > java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) > > org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79) > > org.apache.cassandra.concurrent.NamedThreadFactory$$Lambda$4/45023307.run(Unknown > Source) > java.lang.Thread.run(Unknown Source) > {noformat} > {noformat} > Name: CompactionExecutor:1 > State: BLOCKED on > org.apache.cassandra.db.compaction.LeveledManifest@55c79600 owned by: > CompactionExecutor:2 > Total blocked: 116,196,349 Total waited: 4,771 > Stack trace: > > org.apache.cassandra.db.compaction.LeveledManifest.getCompactionCandidates(LeveledManifest.java:310) > > org.apache.cassandra.db.compaction.LeveledCompactionStrategy.getNextBackgroundTask(LeveledCompactionStrategy.java:119) > > org.apache.cassandra.db.compaction.CompactionStrategyManager.getNextBackgroundTask(CompactionStrategyManager.java:119) > > org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:261) > java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) > java.util.concurrent.FutureTask.run(Unknown Source) > java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) > java.util.concurrent.FutureTask.run(Unknown Source) > java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) > java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) > > org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79) > > org.apache.cassandra.concurrent.NamedThreadFactory$$Lambda$4/45023307.run(Unknown > Source) > java.lang.Thread.run(Unknown Source) > {noformat} > {noformat} > Name: CompactionExecutor:2 > State: BLOCKED on > org.apache.cassandra.db.compaction.LeveledManifest@55c79600 owned by: > CompactionExecutor:1 > Total blocked: 26,542,909 Total waited: 4,373 > Stack trace: > > org.apache.cassandra.db.compaction.LeveledManifest.getCompactionCandidates(LeveledManifest.java:310) > > org.apache.cassandra.db.compaction.LeveledCompactionStrategy.getNextBackgroundTask(LeveledCompactionStrategy.java:119) > > org.apache.cassandra.db.compaction.CompactionStrategyManager.getNextBackgroundTask(CompactionStrategyManager.java:119) > > org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:261) > java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) > java.util.concurrent.FutureTask.run(Unknown Source) > java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) > java.util.concurrent.FutureTask.run(Unknown Source) > java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) > java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) > > org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79) > > org.apache.cassandra.concurrent.NamedThreadFactory$$Lambda$4/45023307.run(Unknown > Source) > java.lang.Thread.run(Unknown Source) > {noformat} > Tp stats from the node > {noformat} > Pool Name Active Pending Completed Blocked All > time blocked > MutationStage 32 762545 53122841 0 > 0 > ViewMutationStage 0 0 0 0 > 0 > ReadStage 0 0 247792 0 > 0 > RequestResponseStage 0 0 200621 0 > 0 > ReadRepairStage 0 0 2489 0 > 0 > CounterMutationStage 0 0 0 0 > 0 > MiscStage 0 0 0 0 > 0 > CompactionExecutor 3 58 26816 0 > 0 > MemtableReclaimMemory 0 0 176 0 > 0 > PendingRangeCalculator 0 0 84 0 > 0 > GossipStage 0 0 235500 0 > 0 > SecondaryIndexManagement 0 0 0 0 > 0 > HintsDispatcher 0 0 749 0 > 0 > PerDiskMemtableFlushWriter_1 0 0 156 0 > 0 > PerDiskMemtableFlushWriter_2 0 0 156 0 > 0 > MigrationStage 1 25 73 0 > 0 > MemtablePostFlush 1 25 1953 0 > 0 > PerDiskMemtableFlushWriter_0 0 0 176 0 > 0 > ValidationExecutor 0 0 1320 0 > 0 > Sampler 0 0 0 0 > 0 > MemtableFlushWriter 1 18 176 0 > 0 > InternalResponseStage 0 0 3030 0 > 0 > AntiEntropyStage 0 0 4087 0 > 0 > CacheCleanupExecutor 0 0 0 0 > 0 > Native-Transport-Requests 0 0 670925 0 > 0 > {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346)