[ https://issues.apache.org/jira/browse/CASSANDRA-5911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13748365#comment-13748365 ]
Vijay commented on CASSANDRA-5911:
----------------------------------

1) Even though the logs show "Replaying", nothing is actually replayed from those segments, because END_OF_SEGMENT_MARKER is placed at the beginning of the file.
    - We can improve the logging so that we don't print commit log (CL) segments that we skip after reading the first 4 bytes.
2) Only the active segment is replayed, even if we flush the CL, since we have not recycled it.
    - One way I can think of to avoid replaying the active segment, at the cost of a performance hit, is to keep a metadata file that holds the info on CF dirty writes, if any (similar to CommitLogSegment#cfLastWrite: write an entry on the first write to the segment and remove it on flush).

> Commit logs are not removed after nodetool flush or nodetool drain
> ------------------------------------------------------------------
>
>                 Key: CASSANDRA-5911
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5911
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: J.B. Langston
>            Assignee: Vijay
>            Priority: Minor
>             Fix For: 2.0.1
>
>
> Commit logs are not removed after nodetool flush or nodetool drain. This can
> lead to unnecessary commit log replay during startup. I've reproduced this
> on Apache Cassandra 1.2.8. Usually this isn't much of an issue, but on a
> Solr-indexed column family in DSE, each replayed mutation has to be reindexed,
> which can make startup take a long time (on the order of 20-30 min).
> Reproduction follows:
> {code}
> jblangston:bin jblangston$ ./cassandra > /dev/null
> jblangston:bin jblangston$ ../tools/bin/cassandra-stress -n 20000000 > /dev/null
> jblangston:bin jblangston$ du -h ../commitlog
> 576M ../commitlog
> jblangston:bin jblangston$ nodetool flush
> jblangston:bin jblangston$ du -h ../commitlog
> 576M ../commitlog
> jblangston:bin jblangston$ nodetool drain
> jblangston:bin jblangston$ du -h ../commitlog
> 576M ../commitlog
> jblangston:bin jblangston$ pkill java
> jblangston:bin jblangston$ du -h ../commitlog
> 576M ../commitlog
> jblangston:bin jblangston$ ./cassandra -f | grep Replaying
> INFO 10:03:42,915 Replaying
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566761.log,
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566762.log,
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566763.log,
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566764.log,
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566765.log,
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566766.log,
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566767.log,
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566768.log,
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566769.log,
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566770.log,
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566771.log,
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566772.log,
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566773.log,
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566774.log,
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566775.log,
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566776.log,
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566777.log,
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566778.log
> INFO 10:03:42,922 Replaying
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566761.log
> INFO 10:03:43,907 Replaying
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566762.log
> INFO 10:03:43,907 Replaying
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566763.log
> INFO 10:03:43,907 Replaying
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566764.log
> INFO 10:03:43,908 Replaying
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566765.log
> INFO 10:03:43,908 Replaying
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566766.log
> INFO 10:03:43,908 Replaying
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566767.log
> INFO 10:03:43,909 Replaying
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566768.log
> INFO 10:03:43,909 Replaying
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566769.log
> INFO 10:03:43,909 Replaying
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566770.log
> INFO 10:03:43,910 Replaying
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566771.log
> INFO 10:03:43,910 Replaying
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566772.log
> INFO 10:03:43,911 Replaying
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566773.log
> INFO 10:03:43,911 Replaying
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566774.log
> INFO 10:03:43,911 Replaying
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566775.log
> INFO 10:03:43,912 Replaying
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566776.log
> INFO 10:03:43,912 Replaying
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566777.log
> INFO 10:03:43,912 Replaying
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566778.log
> {code}
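
For reference, point 1 of the comment (skipping a segment after reading its first 4 bytes) could be sketched roughly as below. This is only an illustration, not the actual Cassandra code: the class and method names are hypothetical, and it assumes the end-of-segment marker is a 4-byte integer with value 0. A logging improvement along the lines suggested could call such a check before printing "Replaying" for a segment.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of Cassandra.
class SegmentMarkerCheck {
    // Assumption: the marker is a 4-byte int equal to 0.
    static final int END_OF_SEGMENT_MARKER = 0;

    /**
     * Returns true if the segment has nothing to replay: either its first
     * 4 bytes equal the marker, or the file is shorter than 4 bytes.
     */
    static boolean startsWithEndMarker(Path segment) throws IOException {
        try (DataInputStream in = new DataInputStream(Files.newInputStream(segment))) {
            // readInt() consumes exactly 4 bytes, big-endian.
            return in.readInt() == END_OF_SEGMENT_MARKER;
        } catch (EOFException e) {
            // Fewer than 4 bytes on disk, so no complete entry exists.
            return true;
        }
    }
}
```

With a check like this, the replay loop could log only the segments for which startsWithEndMarker returns false, so the output would list just the segments that actually contain mutations.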