[ 
https://issues.apache.org/jira/browse/CASSANDRA-4782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-4782:
--------------------------------------

    Attachment: 4782.txt

Thanks, Fabien.  I admit that I didn't realize at first what the millis fix 
(#4601) implies for existing sstable metadata.

Patch attached that bumps the sstable version to hf as a marker that we know 
metadata with that version has sane replay positions.  Replay positions from 
older metadata will be treated as NONE.

This will force a full replay on the first restart under 1.1.6; afterwards, 
any newly flushed sstables will carry the sane-replay-position marker, so 
future restarts will not replay data unnecessarily.
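
To make the version gate concrete, here is a minimal sketch of the idea and 
not the contents of 4782.txt. The class, the Position and Meta types, the 
lexicographic version comparison, and the use of "he" as an older version 
string are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class ReplayPositionSketch {
    /** Hypothetical stand-in for a commitlog replay position: (segment id, offset). */
    static final class Position implements Comparable<Position> {
        /** Sentinel meaning "nothing is known to be flushed; replay everything". */
        static final Position NONE = new Position(-1L, 0);
        final long segment;
        final int offset;

        Position(long segment, int offset) {
            this.segment = segment;
            this.offset = offset;
        }

        @Override
        public int compareTo(Position other) {
            int bySegment = Long.compare(segment, other.segment);
            return bySegment != 0 ? bySegment : Integer.compare(offset, other.offset);
        }
    }

    /** Hypothetical per-sstable metadata: format version plus stored position. */
    static final class Meta {
        final String version;
        final Position stored;

        Meta(String version, Position stored) {
            this.version = version;
            this.stored = stored;
        }
    }

    // Assumption: versions order lexicographically and "hf" is the first trusted one.
    static final String FIRST_SANE_VERSION = "hf";

    /** Positions stored by pre-"hf" sstables are ignored, i.e. treated as NONE. */
    static Position trusted(Meta meta) {
        return meta.version.compareTo(FIRST_SANE_VERSION) >= 0 ? meta.stored : Position.NONE;
    }

    /** Replay starts at the highest trusted position; NONE means a full replay. */
    static Position replayStart(List<Meta> sstables) {
        Position max = Position.NONE;
        for (Meta meta : sstables) {
            Position p = trusted(meta);
            if (p.compareTo(max) > 0) max = p;
        }
        return max;
    }

    public static void main(String[] args) {
        List<Meta> sstables = new ArrayList<Meta>();
        // Old sstable: its nanoTime-based segment id is not comparable any more.
        sstables.add(new Meta("he", new Position(2592000000000000L, 128)));
        System.out.println(replayStart(sstables) == Position.NONE); // first restart: full replay

        // Flushed after the upgrade: carries the new version marker.
        sstables.add(new Meta("hf", new Position(1350000000000L, 64)));
        System.out.println(replayStart(sstables).segment); // later restarts start here
    }
}
```

With only old sstables present the start position is NONE (full replay); once 
an "hf" sstable has been flushed, its position becomes the starting point.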

What do you think?
                
> Commitlog not replayed after restart
> ------------------------------------
>
>                 Key: CASSANDRA-4782
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4782
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.1.0
>            Reporter: Fabien Rousseau
>            Assignee: Jonathan Ellis
>            Priority: Critical
>             Fix For: 1.1.6
>
>         Attachments: 4782.txt
>
>
> There appear to be two corner cases in which the commitlog is not replayed 
> after a restart:
>  - After a server reboot followed by a restart of Cassandra (1.1.0 to 1.1.4)
>  - After an upgrade from Cassandra 1.1.X to Cassandra 1.1.5
> The cause is that replay assumes the commitlog segment id is always 
> increasing (see this condition: 
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java#L247
>  )
> But this assumption can be broken:
> In the first case, the id is generated by System.nanoTime(), which appears 
> to use boot time as its reference (at least on Java 6 and Linux). After a 
> reboot, System.nanoTime() can therefore return a lower number than before 
> the reboot (and the javadoc says only that the reference is an arbitrary 
> point in time).
> In the second case, the problem was introduced by #4601, which replaced 
> System.nanoTime() with System.currentTimeMillis(), so ids generated after 
> the upgrade can be lower than the nanosecond-based ids already recorded. 
> Installations that started on 1.1.5 are therefore safe.
> This could explain the following tickets: #4741 and #4481
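
The two cases in the description above can be illustrated with a small 
sketch. The wouldSkip check is a hypothetical simplification of the linked 
condition, not a copy of it, and the three timestamps are made-up values 
chosen only to show the ordering problem.

```java
final class SegmentIdSketch {
    // Illustrative values only; none are taken from a real cluster.
    static final long NANOS_BEFORE_REBOOT = 30L * 24 * 3600 * 1_000_000_000L; // ~30 days of uptime, in ns
    static final long NANOS_AFTER_REBOOT = 120L * 1_000_000_000L;             // 2 minutes after boot, in ns
    static final long MILLIS_AFTER_UPGRADE = 1_350_000_000_000L;              // a wall-clock time in October 2012, in ms

    /**
     * Hypothetical simplification of the replay check: a segment is skipped
     * when its id is lower than the segment id recorded at the last flush.
     */
    static boolean wouldSkip(long segmentId, long flushedSegmentId) {
        return segmentId < flushedSegmentId;
    }

    public static void main(String[] args) {
        // Case 1: a reboot resets the nanoTime origin, so the new segment looks old.
        System.out.println(wouldSkip(NANOS_AFTER_REBOOT, NANOS_BEFORE_REBOOT));
        // Case 2: ms-based ids are smaller than ns-based ids recorded before the upgrade.
        System.out.println(wouldSkip(MILLIS_AFTER_UPGRADE, NANOS_BEFORE_REBOOT));
        // Fresh 1.1.5 install: all ids are ms-based and keep increasing.
        System.out.println(wouldSkip(MILLIS_AFTER_UPGRADE + 1000, MILLIS_AFTER_UPGRADE));
    }
}
```

In the first two calls the newer segment compares as older than the recorded 
position and would be skipped; in the third it would not.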

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
