[ 
https://issues.apache.org/jira/browse/CASSANDRA-5911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13748365#comment-13748365
 ] 

Vijay commented on CASSANDRA-5911:
----------------------------------

1) Even though the logs show "Replaying", we will not actually replay anything 
from those segments, since END_OF_SEGMENT_MARKER is placed at the beginning of 
the file. 

- We can improve the logging so that we don't print the CL segments that we 
skip after reading the first 4 bytes. 
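
A rough sketch of what that logging check could look like, not the actual 
patch. It assumes the end-of-segment marker is a 4-byte int with value 0 at the 
start of the file; the class and method names (SegmentHeaderCheck, 
startsWithEndMarker, logReplayPlan) are made up for illustration and are not 
part of Cassandra.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

public class SegmentHeaderCheck {
    // Assumption: the marker is a 4-byte int with value 0.
    static final int END_OF_SEGMENT_MARKER = 0;

    /** Returns true if the segment begins with the marker, i.e. has nothing to replay. */
    public static boolean startsWithEndMarker(File segment) throws IOException {
        try (DataInputStream in = new DataInputStream(new FileInputStream(segment))) {
            return in.readInt() == END_OF_SEGMENT_MARKER;
        } catch (EOFException e) {
            // Fewer than 4 bytes on disk: there is no complete entry to replay.
            return true;
        }
    }

    /** Logs "Replaying" only for segments that will actually be read past the first 4 bytes. */
    public static void logReplayPlan(File[] segments) throws IOException {
        for (File segment : segments) {
            if (startsWithEndMarker(segment))
                System.out.println("Skipping empty segment " + segment);
            else
                System.out.println("Replaying " + segment);
        }
    }
}
```

With something like this, the recycled segments in the reproduction would be 
reported as skipped instead of showing up as "Replaying".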

2) Only the active segment is replayed, even if we flush the CL, since we 
have not recycled it. 

- One way I can think of to avoid replaying the active segment, at the cost of 
a performance hit, is to have a metadata file that holds the info on dirty CF 
writes, if any (similar to CommitLogSegment#cfLastWrite: write an entry on the 
first write to the segment and remove it on flush). 
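
To make the metadata file idea concrete, here is a minimal sketch under my own 
assumptions: one sidecar file per segment, one dirty CF id per line, rewritten 
whenever the set of dirty CFs changes. DirtyCfMetadata and its methods are 
hypothetical names, not existing Cassandra classes, and the real thing would 
need to deal with fsync and crash safety.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Set;
import java.util.TreeSet;

public class DirtyCfMetadata {
    private final Path file;                      // hypothetical sidecar file for one segment
    private final Set<String> dirty = new TreeSet<>();

    /** Loads any dirty CF ids left behind by a previous run. */
    public DirtyCfMetadata(Path file) throws IOException {
        this.file = file;
        if (Files.exists(file))
            for (String line : Files.readAllLines(file, StandardCharsets.UTF_8))
                if (!line.isEmpty())
                    dirty.add(line);
    }

    /** Called on a write; only the first write for a CF touches the disk. */
    public synchronized void markDirty(String cfId) throws IOException {
        if (dirty.add(cfId))
            persist();
    }

    /** Called once the CF has been flushed. */
    public synchronized void markClean(String cfId) throws IOException {
        if (dirty.remove(cfId))
            persist();
    }

    /** At startup, the segment only needs replaying if some CF is still dirty. */
    public synchronized boolean needsReplay() {
        return !dirty.isEmpty();
    }

    private void persist() throws IOException {
        Files.write(file, dirty, StandardCharsets.UTF_8);
    }
}
```

The performance hit mentioned above shows up in persist(): every first write 
and every flush for a CF costs an extra file write on top of the commit log 
append.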
                
> Commit logs are not removed after nodetool flush or nodetool drain
> ------------------------------------------------------------------
>
>                 Key: CASSANDRA-5911
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5911
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: J.B. Langston
>            Assignee: Vijay
>            Priority: Minor
>             Fix For: 2.0.1
>
>
> Commit logs are not removed after nodetool flush or nodetool drain. This can 
> lead to unnecessary commit log replay during startup.  I've reproduced this 
> on Apache Cassandra 1.2.8.  Usually this isn't much of an issue but on a 
> Solr-indexed column family in DSE, each replayed mutation has to be reindexed 
> which can make startup take a long time (on the order of 20-30 min).
> Reproduction follows:
> {code}
> jblangston:bin jblangston$ ./cassandra > /dev/null
> jblangston:bin jblangston$ ../tools/bin/cassandra-stress -n 20000000 > /dev/null
> jblangston:bin jblangston$ du -h ../commitlog
> 576M  ../commitlog
> jblangston:bin jblangston$ nodetool flush
> jblangston:bin jblangston$ du -h ../commitlog
> 576M  ../commitlog
> jblangston:bin jblangston$ nodetool drain
> jblangston:bin jblangston$ du -h ../commitlog
> 576M  ../commitlog
> jblangston:bin jblangston$ pkill java
> jblangston:bin jblangston$ du -h ../commitlog
> 576M  ../commitlog
> jblangston:bin jblangston$ ./cassandra -f | grep Replaying
>  INFO 10:03:42,915 Replaying 
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566761.log, 
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566762.log, 
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566763.log, 
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566764.log, 
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566765.log, 
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566766.log, 
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566767.log, 
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566768.log, 
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566769.log, 
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566770.log, 
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566771.log, 
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566772.log, 
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566773.log, 
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566774.log, 
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566775.log, 
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566776.log, 
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566777.log, 
> /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566778.log
>  INFO 10:03:42,922 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566761.log
>  INFO 10:03:43,907 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566762.log
>  INFO 10:03:43,907 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566763.log
>  INFO 10:03:43,907 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566764.log
>  INFO 10:03:43,908 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566765.log
>  INFO 10:03:43,908 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566766.log
>  INFO 10:03:43,908 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566767.log
>  INFO 10:03:43,909 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566768.log
>  INFO 10:03:43,909 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566769.log
>  INFO 10:03:43,909 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566770.log
>  INFO 10:03:43,910 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566771.log
>  INFO 10:03:43,910 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566772.log
>  INFO 10:03:43,911 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566773.log
>  INFO 10:03:43,911 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566774.log
>  INFO 10:03:43,911 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566775.log
>  INFO 10:03:43,912 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566776.log
>  INFO 10:03:43,912 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566777.log
>  INFO 10:03:43,912 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566778.log
> {code}

