[ 
https://issues.apache.org/jira/browse/CASSANDRA-1967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422613#comment-13422613
 ] 

Robert Coli commented on CASSANDRA-1967:
----------------------------------------

After making the above update, I noticed Cassandra 1.0.10 flushing after 
replay. Because this clashed with my reading of the code, I conjectured that 
the flush must happen deeper in the code path than in previous versions, and 
deeper than I read this time. I asked about this in #cassandra.

Per jbellis in #cassandra :

1) Explicit flush at the end of replay is by design.
2) The design goal in this case is to avoid replaying the same log multiple 
times if the node crashes before the replayed data is flushed.

I don't find 2) a compelling design goal, and believe it violates the principle 
of least surprise. 

The purpose of the commitlog is to hold the contents of memtables. In the case 
of a crash, I expect the commitlog replay process to result in the same 
memtables that my node contained before it crashed. If it then crashes again, I 
expect the same memtables to be replayed again. There may be some negative 
externalities to this repeated replay which are not currently clear to me, but 
I am relatively confident that being surprised by my memtable state is not one 
of them.
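
As a minimal sketch of that expectation, here is a toy model in Java. It is 
not Cassandra code: the Node and Mutation types, the restart() method, and its 
flushAfterReplay flag are all hypothetical names invented for illustration. 
Replaying without a flush rebuilds the same memtable on every restart, while 
replaying with a flush writes a new sstable on restart.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model only: none of these types are Cassandra classes. */
public class ReplaySketch {

    /** One logged write (hypothetical). */
    static final class Mutation {
        final String key;
        final String value;
        Mutation(String key, String value) { this.key = key; this.value = value; }
    }

    /** Hypothetical node: durable log and sstables, volatile memtable. */
    static final class Node {
        final List<Mutation> commitLog = new ArrayList<>();            // survives a crash
        final List<Map<String, String>> sstables = new ArrayList<>();  // survives a crash
        Map<String, String> memtable = new TreeMap<>();                // lost on a crash

        void write(String key, String value) {
            commitLog.add(new Mutation(key, value)); // log first, then apply
            memtable.put(key, value);
        }

        void crash() {
            memtable = new TreeMap<>(); // in-memory state is gone
        }

        void flush() {
            if (memtable.isEmpty()) return;
            sstables.add(memtable);     // memtable becomes an on-disk table
            memtable = new TreeMap<>();
            commitLog.clear();          // logged writes are now durable elsewhere
        }

        /** Replays the log; the flag models the behaviour under discussion. */
        void restart(boolean flushAfterReplay) {
            memtable = new TreeMap<>();
            for (Mutation m : commitLog) memtable.put(m.key, m.value);
            if (flushAfterReplay) flush();
        }
    }

    public static void main(String[] args) {
        Node noFlush = new Node();
        noFlush.write("k1", "v1");
        noFlush.write("k2", "v2");
        Map<String, String> before = new TreeMap<>(noFlush.memtable);

        // Two crashes in a row: each replay rebuilds the same memtable.
        noFlush.crash();
        noFlush.restart(false);
        noFlush.crash();
        noFlush.restart(false);
        System.out.println("same memtable: " + noFlush.memtable.equals(before));
        System.out.println("sstables without flush: " + noFlush.sstables.size());

        // Same writes, but replay ends with a flush: a new sstable appears.
        Node withFlush = new Node();
        withFlush.write("k1", "v1");
        withFlush.write("k2", "v2");
        withFlush.crash();
        withFlush.restart(true);
        System.out.println("sstables with flush: " + withFlush.sstables.size());
    }
}
```

In this model the extra sstable produced by restart(true) stands in for the 
on-disk change that could make a node eligible for compaction right after 
startup. The model says nothing about how real compaction thresholds work.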

In my opinion, avoiding compaction as a side effect of restart/replay is, in 
contrast, a compelling design goal.

Significant production users appear to agree in CASSANDRA-2444 ("[Twitter has] 
ran into many times where we do not want compaction to run right away against 
CFs when booting up a node.") But the resolution of CASSANDRA-2444 ("If the 
node needs to compact, it will do so at the first flush, which is more likely 
to be staggered across the cluster") does not make sense if commitlog replay 
always ends with a flush. The logical result of both code paths appears to be 
the same: a restart can trigger immediate compaction.

In summary... +1 for re-opening this ticket and making commit log replay not 
end with a flush.
                
> commit log replay shouldn't end with a flush
> --------------------------------------------
>
>                 Key: CASSANDRA-1967
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1967
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.3
>            Reporter: Robert Coli
>            Priority: Minor
>
> (Apologies in advance if there is some very compelling reason to flush after 
> replay, of which I am not currently aware. ;D)
> Currently, when a node restarts, the following sequence occurs :
> a) commitlog is replayed
> b) any memtables resulting from a) are flushed 
> c) a new commitlog is opened, new memtables are switched in
> ... (other stuff happens)
> d) node starts taking traffic
> This has side effects, perhaps most seriously the potential of triggering 
> compaction. As a node is likely to struggle performance-wise after 
> restarting, triggering compaction at that time seems like something we might 
> wish to avoid.
> I propose that the sequence be :
> a) commitlog is replayed
> b) a new commitlog is opened, new memtables are switched in 
> ... (other stuff happens)
> c) node starts taking traffic
> Looking through the relevant code, the only code that appears to depend on 
> this flush is at 
> src/java/org/apache/cassandra/db/commitlog/CommitLog.java:112 :
> "
>         // all old segments are recovered and deleted before CommitLog is 
> instantiated.
>         // All we need to do is create a new one.
>         segments.add(new CommitLogSegment());
> "
> Presumably this code would have to be refactored to be aware of the currently 
> open commitlog.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
