[ https://issues.apache.org/jira/browse/CASSANDRA-6079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776493#comment-13776493 ]
Jonathan Ellis commented on CASSANDRA-6079:
-------------------------------------------

I think you're missing that we need to replay not just for coordinator failure but for replica failure. So "only scan for last couple seconds" is not correct; you need to scan every partition since the replica went down.

(Note that for typical deployments, replica failure will be 3x more likely than coordinator failure.)

> Memtables flush is delayed when having a lot of batchlog activity, making node OOM
> ----------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-6079
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6079
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Oleg Anastasyev
>            Assignee: Oleg Anastasyev
>            Priority: Minor
>             Fix For: 1.2.11, 2.0.2
>
>         Attachments: NoWaitBatchlogCompaction.diff
>
>
> Both MeteredFlusher and BatchlogManager share the same OptionalTasks thread.
> So, when batchlog manager processes its tasks no flushes can occur. Even
> more, batchlog manager waits for batchlog CF compaction to finish.
> On a lot of batchlog activity this prevents memtables from flush for a long
> time, making the node OOM.
> Fixed this by moving batchlog to its own thread and not waiting for batchlog
> compaction to finish.
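
The starvation described in the issue can be reproduced in isolation. The sketch below is not Cassandra code: the class name SharedExecutorDemo, the method flushesDuringSlowTask, and the sleep durations are all hypothetical stand-ins. It only assumes what the description states, namely that a periodic flush task and a long-running batchlog task are queued on the same single-threaded executor, and compares that with giving the batchlog task its own thread.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class SharedExecutorDemo {

    /**
     * Counts how many periodic "flush" ticks ran while a slow "batchlog"
     * task was busy. Both tasks are illustrative stand-ins, not Cassandra's
     * MeteredFlusher or BatchlogManager.
     */
    static int flushesDuringSlowTask(boolean sharedThread) {
        ScheduledExecutorService flushExecutor =
                Executors.newSingleThreadScheduledExecutor();
        // Either reuse the flush executor (the situation in the report) or
        // give the batchlog work its own thread (the direction of the fix).
        ScheduledExecutorService batchlogExecutor = sharedThread
                ? flushExecutor
                : Executors.newSingleThreadScheduledExecutor();

        AtomicInteger flushes = new AtomicInteger();
        AtomicInteger seenBySlowTask = new AtomicInteger(-1);
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch finished = new CountDownLatch(1);

        // Stand-in for a long batchlog replay that occupies its thread.
        batchlogExecutor.execute(() -> {
            started.countDown();
            try {
                Thread.sleep(300);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            // Record how many flushes happened while this task was running.
            seenBySlowTask.set(flushes.get());
            finished.countDown();
        });

        try {
            // Schedule the flush only after the slow task holds its thread.
            started.await();
            flushExecutor.scheduleAtFixedRate(
                    flushes::incrementAndGet, 10, 20, TimeUnit.MILLISECONDS);
            finished.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } finally {
            flushExecutor.shutdownNow();
            batchlogExecutor.shutdownNow();
        }
        return seenBySlowTask.get();
    }

    public static void main(String[] args) {
        System.out.println("shared thread:    " + flushesDuringSlowTask(true));
        System.out.println("separate threads: " + flushesDuringSlowTask(false));
    }
}
```

With the shared executor the count is 0, because the only worker thread is held by the slow task for its whole duration. With separate executors the flush task keeps ticking, and the exact count depends on scheduling. Note that this only illustrates the thread-sharing half of the report; it says nothing about the compaction wait or about the replay-scope concern raised in the comment above.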