[ https://issues.apache.org/jira/browse/CASSANDRA-8798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksey Yeschenko resolved CASSANDRA-8798.
------------------------------------------
       Resolution: Won't Fix
         Reviewer:   (was: Ariel Weisberg)
    Fix Version/s:     (was: 2.1.x)

Closing as Won't Fix for now. If you find a new way to deal with it, please feel free to create a new ticket.

> don't throw TombstoneOverwhelmingException during bootstrap
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-8798
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8798
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: mck
>         Attachments: 8798.txt
>
>
> During bootstrap, honouring tombstone_failure_threshold seems
> counter-productive, as the node is not serving requests and so is not
> protecting anything.
> Instead, bootstrap fails, and a cluster that obviously needs an extra node
> isn't getting it...
> **History**
> When adding a new node, the bootstrap process looks complete in that
> streaming is finished, compactions are finished, and all disk and CPU
> activity is calm.
> But the node is still stuck in "joining" status.
> The last stage in the bootstrapping process is the rebuilding of secondary
> indexes. Grepping the logs confirmed it failed during this stage.
> {code}grep SecondaryIndexManager cassandra/logs/*{code}
> To see which secondary index rebuilds were initiated:
> {code}
> grep "index build of " cassandra/logs/* | awk -F" for data in " '{print $1}'
> INFO 13:18:11,252 Submitting index build of addresses.unobfuscatedIndex
> INFO 13:18:11,352 Submitting index build of Inbox.FINNBOXID_INDEX
> INFO 23:03:54,758 Submitting index build of [events.collected_tbIndex, events.real_tbIndex]
> {code}
> To see which secondary index rebuilds completed successfully:
> {code}grep "Index build of " cassandra/logs/*
> INFO 13:18:11,263 Index build of addresses.unobfuscatedIndex complete
> INFO 13:18:11,355 Index build of Inbox.FINNBOXID_INDEX complete
> {code}
> Looking closer at {{[events.collected_tbIndex, events.real_tbIndex]}} showed
> the following stacktrace:
> {code}
> ERROR [StreamReceiveTask:121] 2015-02-12 05:54:47,768 CassandraDaemon.java (line 199) Exception in thread Thread[StreamReceiveTask:121,5,main]
> java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: org.apache.cassandra.db.filter.TombstoneOverwhelmingException
>         at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:413)
>         at org.apache.cassandra.db.index.SecondaryIndexManager.maybeBuildSecondaryIndexes(SecondaryIndexManager.java:142)
>         at org.apache.cassandra.streaming.StreamReceiveTask$OnCompletionRunnable.run(StreamReceiveTask.java:130)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: org.apache.cassandra.db.filter.TombstoneOverwhelmingException
>         at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>         at java.util.concurrent.FutureTask.get(FutureTask.java:188)
>         at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:409)
>         ... 7 more
> Caused by: java.lang.RuntimeException: org.apache.cassandra.db.filter.TombstoneOverwhelmingException
>         at org.apache.cassandra.service.pager.QueryPagers$1.next(QueryPagers.java:160)
>         at org.apache.cassandra.service.pager.QueryPagers$1.next(QueryPagers.java:143)
>         at org.apache.cassandra.db.Keyspace.indexRow(Keyspace.java:406)
>         at org.apache.cassandra.db.index.SecondaryIndexBuilder.build(SecondaryIndexBuilder.java:62)
>         at org.apache.cassandra.db.compaction.CompactionManager$9.run(CompactionManager.java:834)
>         ... 5 more
> Caused by: org.apache.cassandra.db.filter.TombstoneOverwhelmingException
>         at org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:202)
>         at org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:122)
>         at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:80)
>         at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:72)
>         at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:297)
>         at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:53)
>         at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1547)
>         at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1376)
>         at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:333)
>         at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:65)
>         at org.apache.cassandra.service.pager.SliceQueryPager.queryNextPage(SliceQueryPager.java:85)
>         at org.apache.cassandra.service.pager.AbstractQueryPager.fetchPage(AbstractQueryPager.java:88)
>         at org.apache.cassandra.service.pager.SliceQueryPager.fetchPage(SliceQueryPager.java:35)
>         at org.apache.cassandra.service.pager.QueryPagers$1.next(QueryPagers.java:154)
>         ... 9 more
> {code}
> To get past this I had to raise
> org.apache.cassandra.db:type=StorageService.TombstoneFailureThreshold and
> manually rebuild the index, then restart the node with auto_bootstrap=false.
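
The two grep passes in the ticket (index builds submitted vs. index builds completed) can be combined into one step that lists the indexes whose build never finished. The following is only a minimal sketch, not part of the ticket or of Cassandra itself: the function names are hypothetical, and the regular expressions assume the log wording is exactly as quoted in the ticket, which may differ in other Cassandra versions.

```python
import re

# Patterns mirror the log messages quoted in the ticket. That other Cassandra
# versions use the same wording is an assumption.
SUBMITTED = re.compile(r"Submitting index build of (.+?)(?: for data in .*)?$")
COMPLETED = re.compile(r"Index build of (.+?) complete")


def index_names(raw):
    """Split 'ks.idx' or '[ks.idx1, ks.idx2]' into a list of index names."""
    return [name.strip() for name in raw.strip().strip("[]").split(",") if name.strip()]


def unfinished_index_builds(lines):
    """Return indexes that were submitted for building but never logged as complete."""
    submitted, completed = [], set()
    for line in lines:
        line = line.rstrip("\n")
        match = SUBMITTED.search(line)
        if match:
            for name in index_names(match.group(1)):
                if name not in submitted:
                    submitted.append(name)  # keep first-seen order
            continue
        match = COMPLETED.search(line)
        if match:
            completed.update(index_names(match.group(1)))
    return [name for name in submitted if name not in completed]


# Sample lines taken from the ticket's grep output.
SAMPLE = [
    "INFO 13:18:11,252 Submitting index build of addresses.unobfuscatedIndex",
    "INFO 13:18:11,263 Index build of addresses.unobfuscatedIndex complete",
    "INFO 13:18:11,352 Submitting index build of Inbox.FINNBOXID_INDEX",
    "INFO 13:18:11,355 Index build of Inbox.FINNBOXID_INDEX complete",
    "INFO 23:03:54,758 Submitting index build of [events.collected_tbIndex, events.real_tbIndex]",
]

if __name__ == "__main__":
    print(unfinished_index_builds(SAMPLE))
    # prints ['events.collected_tbIndex', 'events.real_tbIndex']
```

To run it against real logs, pass an open file (or several chained together) instead of SAMPLE. Any index it reports is a candidate for the manual rebuild described in the ticket.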