[ https://issues.apache.org/jira/browse/CASSANDRA-13538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16501322#comment-16501322 ]
Jai Bheemsen Rao Dhanwada commented on CASSANDRA-13538:
-------------------------------------------------------

Noticed a similar issue in one of our environments. Does anyone have a workaround?

> Cassandra tasks permanently block after the following assertion occurs during compaction: "java.lang.AssertionError: Interval min > max"
> -----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-13538
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13538
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Compaction
>         Environment: This happens on a 7 node system with 2 data centers. We're using Cassandra version 2.1.15. I upgraded to 2.1.17 and it still occurs.
>            Reporter: Andy Klages
>            Priority: Major
>             Fix For: 2.1.x
>
>         Attachments: cassandra.yaml, jstack.out, schema.cql3, system.log, tpstats.out
>
>
> We noticed this problem because the commitlogs proliferate to the point that we eventually run out of disk space.
> nodetool tpstats shows several of the tasks backed up:
> {code}
> Pool Name                   Active   Pending   Completed   Blocked  All time blocked
> MutationStage                    0         0   134335315         0                 0
> ReadStage                        0         0   643986790         0                 0
> RequestResponseStage             0         0      114298         0                 0
> ReadRepairStage                  0         0          36         0                 0
> CounterMutationStage             0         0           0         0                 0
> MiscStage                        0         0           0         0                 0
> AntiEntropySessions              1         1       79357         0                 0
> HintedHandoff                    0         0          90         0                 0
> GossipStage                      0         0     6595098         0                 0
> CacheCleanupExecutor             0         0           0         0                 0
> InternalResponseStage            0         0     1638369         0                 0
> CommitLogArchiver                0         0           0         0                 0
> CompactionExecutor               2       175     2922542         0                 0
> ValidationExecutor               0         0     1465374         0                 0
> MigrationStage                   1        76         600         0                 0
> AntiEntropyStage                 1       923     8291098         0                 0
> PendingRangeCalculator           0         0          20         0                 0
> Sampler                          0         0           0         0                 0
> MemtableFlushWriter              0         0       53017         0                 0
> MemtablePostFlush                1      4584     1545141         0                 0
> MemtableReclaimMemory            0         0       70639         0                 0
> Native-Transport-Requests        0         0      352559         0                 0
> {code}
> This all starts after the following exception is raised in Cassandra:
> {code}
> ERROR [MemtableFlushWriter:2437] 2017-05-15 01:53:23,380 CassandraDaemon.java:231 - Exception in thread Thread[MemtableFlushWriter:2437,5,main]
> java.lang.AssertionError: Interval min > max
>     at org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:249) ~[apache-cassandra-2.1.15.jar:2.1.15]
>     at org.apache.cassandra.utils.IntervalTree.<init>(IntervalTree.java:72) ~[apache-cassandra-2.1.15.jar:2.1.15]
>     at org.apache.cassandra.db.DataTracker$SSTableIntervalTree.<init>(DataTracker.java:603) ~[apache-cassandra-2.1.15.jar:2.1.15]
>     at org.apache.cassandra.db.DataTracker$SSTableIntervalTree.<init>(DataTracker.java:597) ~[apache-cassandra-2.1.15.jar:2.1.15]
>     at org.apache.cassandra.db.DataTracker.buildIntervalTree(DataTracker.java:578) ~[apache-cassandra-2.1.15.jar:2.1.15]
>     at org.apache.cassandra.db.DataTracker$View.replaceFlushed(DataTracker.java:740) ~[apache-cassandra-2.1.15.jar:2.1.15]
>     at org.apache.cassandra.db.DataTracker.replaceFlushed(DataTracker.java:172) ~[apache-cassandra-2.1.15.jar:2.1.15]
>     at org.apache.cassandra.db.compaction.AbstractCompactionStrategy.replaceFlushed(AbstractCompactionStrategy.java:234) ~[apache-cassandra-2.1.15.jar:2.1.15]
>     at org.apache.cassandra.db.ColumnFamilyStore.replaceFlushed(ColumnFamilyStore.java:1521) ~[apache-cassandra-2.1.15.jar:2.1.15]
>     at org.apache.cassandra.db.Memtable$FlushRunnable.runMayThrow(Memtable.java:336) ~[apache-cassandra-2.1.15.jar:2.1.15]
>     at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[apache-cassandra-2.1.15.jar:2.1.15]
>     at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297) ~[guava-16.0.jar:na]
>     at org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1127) ~[apache-cassandra-2.1.15.jar:2.1.15]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_121]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[na:1.8.0_121]
>     at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_121]
> {code}
> This has only occurred on one of our system testers' setups, but with regularity. I couldn't begin to tell you how to reproduce it. We have many systems deployed, but only this one setup encounters the issue. I have included the jstack output, config file, log file, and schema. I even have a heap dump available if needed. After looking at the heap dump, the best I can tell is that the assertion failure left a lock (i.e. a latch) in a locked state, which then causes a backlog of pending tasks.
> I'm hoping this assertion will mean something to the Cassandra development community and can perhaps be fixed in a newer release.
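
The reporter's hypothesis (an assertion failure leaving a latch unreleased, so that everything waiting on it backs up) can be illustrated with a small standalone sketch. This is not Cassandra's code and does not confirm the root cause of CASSANDRA-13538. The class `LatchLeakSketch` and the methods `checkInterval` and `flushReleasesWaiter` are hypothetical names made up for this illustration, and the check throws `AssertionError` explicitly so it fires even when JVM assertions are disabled.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class LatchLeakSketch {

    // Hypothetical stand-in for an interval sanity check (not Cassandra's IntervalTree).
    static void checkInterval(int min, int max) {
        if (min > max) {
            throw new AssertionError("Interval min > max");
        }
    }

    // Runs a pretend "flush" on a worker thread and reports whether a waiter
    // on the latch was released within a short timeout.
    static boolean flushReleasesWaiter(int min, int max, boolean useFinally)
            throws InterruptedException {
        CountDownLatch flushed = new CountDownLatch(1);
        ExecutorService flushWriter = Executors.newSingleThreadExecutor();
        try {
            flushWriter.execute(() -> {
                if (useFinally) {
                    try {
                        checkInterval(min, max);
                    } finally {
                        flushed.countDown(); // released even if the check throws
                    }
                } else {
                    checkInterval(min, max);
                    flushed.countDown();     // skipped when the check throws
                }
            });
            // A real waiter without a timeout would block forever in the failing case.
            return flushed.await(500, TimeUnit.MILLISECONDS);
        } finally {
            flushWriter.shutdownNow();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("valid interval, no finally:   " + flushReleasesWaiter(1, 2, false));
        System.out.println("invalid interval, no finally: " + flushReleasesWaiter(2, 1, false));
        System.out.println("invalid interval, finally:    " + flushReleasesWaiter(2, 1, true));
    }
}
```

In this sketch only the middle case leaves the waiter blocked (the worker thread also prints the uncaught `AssertionError` to stderr). That is the same shape as the MemtablePostFlush backlog in the tpstats output above, assuming the hypothesis holds.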
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org