[ https://issues.apache.org/jira/browse/CASSANDRA-10871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rafael Harutyunyan updated CASSANDRA-10871:
-------------------------------------------
    Environment: Linux cassandra1 2.6.32-573.3.1.el6.x86_64 #1 SMP Thu Aug 13 22:55:16 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux; Java(TM) SE Runtime Environment (build 1.7.0_67-b01)
    was: Linux cassandra1 2.6.32-573.3.1.el6.x86_64 #1 SMP Thu Aug 13 22:55:16 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

> MemtableFlushWriter blocks and no flushing happens
> --------------------------------------------------
>
>                 Key: CASSANDRA-10871
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10871
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Compaction, Local Write-Read Paths
>         Environment: Linux cassandra1 2.6.32-573.3.1.el6.x86_64 #1 SMP Thu Aug 13 22:55:16 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux; Java(TM) SE Runtime Environment (build 1.7.0_67-b01)
>            Reporter: Rafael Harutyunyan
>            Priority: Critical
>             Fix For: 2.1.11
>
>         Attachments: full_thread_dump.txt
>
>
> After some time the MemtableFlushWriter threads block. First the FlushWriterQueue fills up completely, then the MutationStage queue does.
> After this, one of two things may happen: either Cassandra drops the queued mutations and everything returns to normal, or it shuts down with insufficient heap space.
> Here is the thread dump.
> {noformat}
> "MemtableFlushWriter:3" - Thread t@2610
>    java.lang.Thread.State: BLOCKED
>       at org.apache.cassandra.db.compaction.WrappingCompactionStrategy.handleNotification(WrappingCompactionStrategy.java:250)
>       - waiting to lock <f9dab27> (a org.apache.cassandra.db.compaction.WrappingCompactionStrategy) owned by "CompactionExecutor:51" t@2638
>       at org.apache.cassandra.db.DataTracker.notifyAdded(DataTracker.java:518)
>       at org.apache.cassandra.db.DataTracker.replaceFlushed(DataTracker.java:178)
>       at org.apache.cassandra.db.compaction.AbstractCompactionStrategy.replaceFlushed(AbstractCompactionStrategy.java:234)
>       at org.apache.cassandra.db.ColumnFamilyStore.replaceFlushed(ColumnFamilyStore.java:1502)
>       at org.apache.cassandra.db.Memtable$FlushRunnable.runMayThrow(Memtable.java:336)
>       at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>       at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
>       at org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1115)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>       at java.lang.Thread.run(Thread.java:745)
>
>    Locked ownable synchronizers:
>       - locked <7ef8cd1b> (a java.util.concurrent.ThreadPoolExecutor$Worker)
>
> "MemtableFlushWriter:4" - Thread t@2616
>    java.lang.Thread.State: BLOCKED
>       at org.apache.cassandra.db.compaction.WrappingCompactionStrategy.handleNotification(WrappingCompactionStrategy.java:250)
>       - waiting to lock <f9dab27> (a org.apache.cassandra.db.compaction.WrappingCompactionStrategy) owned by "CompactionExecutor:51" t@2638
>       at org.apache.cassandra.db.DataTracker.notifyAdded(DataTracker.java:518)
>       at org.apache.cassandra.db.DataTracker.replaceFlushed(DataTracker.java:178)
>       at org.apache.cassandra.db.compaction.AbstractCompactionStrategy.replaceFlushed(AbstractCompactionStrategy.java:234)
>       at org.apache.cassandra.db.ColumnFamilyStore.replaceFlushed(ColumnFamilyStore.java:1502)
>       at org.apache.cassandra.db.Memtable$FlushRunnable.runMayThrow(Memtable.java:336)
>       at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>       at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
>       at org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1115)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>       at java.lang.Thread.run(Thread.java:745)
>
>    Locked ownable synchronizers:
>       - locked <2f842d9b> (a java.util.concurrent.ThreadPoolExecutor$Worker)
> {noformat}
> and here are the tpstats
> {noformat}
> Pool Name                   Active  Pending  Completed  Blocked  All time blocked
> CounterMutationStage             0        0          0        0                 0
> ReadStage                        0        0         28        0                 0
> RequestResponseStage             0        0    2020253        0                 0
> MutationStage                   32    63221   27858588        0                 0
> ReadRepairStage                  0        0          0        0                 0
> GossipStage                      0        0      16430        0                 0
> CacheCleanupExecutor             0        0          0        0                 0
> AntiEntropyStage                 0        0       3008        0                 0
> MigrationStage                   0        0          0        0                 0
> Sampler                          0        0          0        0                 0
> ValidationExecutor               0        0       1500        0                 0
> CommitLogArchiver                0        0          0        0                 0
> MiscStage                        0        0          0        0                 0
> MemtableFlushWriter              2      220       3531        0                 0
> MemtableReclaimMemory            0        0       4277        0                 0
> PendingRangeCalculator           0        0         22        0                 0
> MemtablePostFlush                1      306       5186        0                 0
> CompactionExecutor              36      142       5326        0                 0
> InternalResponseStage            0        0          0        0                 0
> HintedHandoff                    0        0         13        0                 0
>
> Message type          Dropped
> RANGE_SLICE                 0
> READ_REPAIR                 0
> PAGED_RANGE                 0
> BINARY                      0
> READ                        0
> MUTATION               220352
> _TRACE                      0
> REQUEST_RESPONSE            0
> COUNTER_MUTATION            0
> {noformat}
> cfstats reports over 12k sstables.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
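
The BLOCKED state in the thread dump above means the flush threads are waiting to enter a monitor that another thread ("CompactionExecutor:51") currently holds. The following is a minimal, self-contained sketch of that general pattern only. It is not Cassandra code: the class `MonitorContentionSketch`, the nested `Strategy` class, and the method `longRunningTask` are hypothetical names chosen for illustration, and what the owning thread was actually doing is not shown in the quoted excerpt.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration; none of these names come from Cassandra.
class MonitorContentionSketch {

    /** Stand-in for an object whose methods all synchronize on one monitor. */
    static class Strategy {
        // Assumed long-running work that keeps the monitor held.
        synchronized void longRunningTask(CountDownLatch acquired, CountDownLatch release) {
            acquired.countDown();          // signal: the monitor is now held
            try {
                release.await();           // keep holding it until told to stop
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }

        // Short callback that needs the same monitor before it can run.
        synchronized void handleNotification() {
            // nothing to do; entering the method is what matters here
        }
    }

    /** Returns the state of the "flush" thread while the monitor is held elsewhere. */
    static Thread.State observeFlushStateWhileLockHeld() {
        Strategy strategy = new Strategy();
        CountDownLatch acquired = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);

        Thread compaction = new Thread(
                () -> strategy.longRunningTask(acquired, release), "compaction-sketch");
        Thread flush = new Thread(strategy::handleNotification, "flush-sketch");
        try {
            compaction.start();
            acquired.await();              // wait until the monitor is really held
            flush.start();

            // Poll (bounded to 5 seconds) until the flush thread reaches the monitor.
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
            while (flush.getState() != Thread.State.BLOCKED
                    && System.nanoTime() < deadline) {
                Thread.sleep(1);
            }
            return flush.getState();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            release.countDown();           // let the holder leave the monitor
            joinQuietly(compaction);
            joinQuietly(flush);
        }
    }

    private static void joinQuietly(Thread t) {
        try {
            t.join(TimeUnit.SECONDS.toMillis(5));
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("flush thread state: " + observeFlushStateWhileLockHeld());
    }
}
```

In this sketch the waiting thread stays BLOCKED for as long as the holder keeps the monitor, so any queue feeding the waiting threads can only grow in the meantime. That matches the shape of the pending counts in the tpstats output, though the sketch says nothing about why the lock was held for so long in the reported case.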