[ https://issues.apache.org/jira/browse/CASSANDRA-5367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13607826#comment-13607826 ]
Brandon Williams commented on CASSANDRA-5367:
---------------------------------------------

It looks like hints aren't stuck; there's a thread trying to deliver to a host and there's a large compaction of hints going on. The host that the hints are for is the problem.

> Hints stuck on compaction
> -------------------------
>
>                 Key: CASSANDRA-5367
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5367
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.2.2
>         Environment: 80 Node cluster on 1.2.2 (problem has been around since before 1.0)
>            Reporter: Brooke Bryan
>         Attachments: thread.log
>
>
> When our cluster is handling hints, we will very often see hints get stuck on
> nodes if they are unable to communicate with another node. The problem is not
> that the other node is down; the other node will be sat doing compactions, or
> running out of memory. While that node is a problem, and needs to be fixed,
> all other nodes on the cluster will stick waiting to handle hints between
> that node and itself.
> This causes a pretty major knock-on effect throughout the entire cluster,
> causing hints to back up. We are seeing some nodes backed up with 14GB of
> hints, after 2 days of the hints being stuck.
> Also, during this "stuck" session, compactionstats will show a compaction on
> the system hints column family, and not change the completed bytes amount.
> This is the only reason for an entire cluster to get very bogged down from
> what I have experienced, and requires a lot of manual intervention to get
> everything back online.
> After putting a node into debug mode, I have narrowed down the issue to be
> within:
> startColumn = hint.name(); (line ~361 HintedHandoffManager) and line 390
> based on the log output, and through pausing handoffs etc.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira