[ 
https://issues.apache.org/jira/browse/CASSANDRA-6666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078171#comment-14078171
 ] 

Vishal Mehta commented on CASSANDRA-6666:
-----------------------------------------

Hello everyone,

Please pardon my ignorance; this is the first time I am writing in an 
open-source bug report.

I think I recently hit this bug, because I saw similar symptoms in my 3-node 
Cassandra setup. I am running a test at around 12K qps (inserts into 3 
different tables) with the TTL set to 1 hour, and the keyspace has GC grace 
seconds (gc_grace_seconds) set to 14400 (4 hours).

The test eventually runs to a point where Cassandra sees more than 100K 
tombstones, and it crashes with the following exception in 
/var/log/cassandra/cassandra.log.

{noformat}
ERROR 13:23:56,747 Scanned over 100000 tombstones in system.hints; query aborted (see tombstone_fail_threshold)
ERROR 13:23:56,962 Exception in thread Thread[HintedHandoff:1,1,main]
org.apache.cassandra.db.filter.TombstoneOverwhelmingException
        at org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:202)
        at org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:122)
        at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:80)
        at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:72)
        at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:297)
        at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:53)
        at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1547)
        at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1376)
        at org.apache.cassandra.db.HintedHandOffManager.doDeliverHintsToEndpoint(HintedHandOffManager.java:373)
        at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:330)
        at org.apache.cassandra.db.HintedHandOffManager.access$300(HintedHandOffManager.java:91)
        at org.apache.cassandra.db.HintedHandOffManager$5.run(HintedHandOffManager.java:547)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:724)
 INFO 13:24:00,987 No gossip backlog; proceeding
{noformat}

*Note:* Is it reasonable to set GC grace seconds closer to the TTL? Also, I 
could see that one of the nodes deleted all the records from disk and freed up 
the space, whereas the other two nodes never deleted their tombstones.
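
For a sense of scale, here is a rough back-of-the-envelope sketch of the 
numbers in this test. It is only an estimate under stated assumptions: a 
constant write rate, every write expiring exactly at its TTL, and every 
expired write staying on disk for the full GC grace period before compaction 
may drop it. Replication and compaction timing are ignored, and the function 
and constant names are made up for illustration. The exception above was 
raised while reading system.hints, so this only illustrates the user-table 
side of the question.

```python
# Back-of-the-envelope estimate for the workload described in this comment.
# Assumptions (not measured): constant write rate, every write expires at its
# TTL, and an expired write is kept for the full GC grace period before
# compaction may purge it. Replication is ignored.

WRITES_PER_SECOND = 12_000   # "around 12K qps" from the test
TTL_SECONDS = 3_600          # 1 hour
GC_GRACE_SECONDS = 14_400    # 4 hours
TOMBSTONE_LIMIT = 100_000    # limit quoted in the log message (per query)


def steady_state_counts(writes_per_second, ttl_seconds, gc_grace_seconds):
    """Return (live_writes, expired_writes_awaiting_purge) at steady state."""
    live = writes_per_second * ttl_seconds
    expired = writes_per_second * gc_grace_seconds
    return live, expired


live, expired = steady_state_counts(
    WRITES_PER_SECOND, TTL_SECONDS, GC_GRACE_SECONDS
)
print(f"live writes:                  {live:,}")
print(f"expired, awaiting purge:      {expired:,}")
print(f"expired per live write:       {expired / live:.1f}")
print(f"seconds of writes per limit:  {TOMBSTONE_LIMIT / WRITES_PER_SECOND:.1f}")
```

Under these assumptions, expired data outnumbers live data by the ratio of GC 
grace seconds to TTL (4 to 1 here), so bringing the two values closer together 
would shrink that ratio. Note that the 100K limit applies to a single read, 
not to the total on disk.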



> Avoid accumulating tombstones after partial hint replay
> -------------------------------------------------------
>
>                 Key: CASSANDRA-6666
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6666
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>            Priority: Minor
>              Labels: hintedhandoff
>             Fix For: 2.0.10
>
>         Attachments: 6666.txt, cassandra_system.log.debug.gz
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)
