[ https://issues.apache.org/jira/browse/CASSANDRA-7720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14090234#comment-14090234 ]

Mike Schrag commented on CASSANDRA-7720:
----------------------------------------

Even with RAMP, I'm not sure that would cover you. In this particular case, the 
two inserts into the tables were not even in the same atomic batch, so we're 
really just racing against the sstables on disk. We're a hapless victim of the 
order in which Cassandra hard-links sstables into the snapshot folder, both 
across sstables in multiple column families on the same node and across 
sstables in multiple column families on multiple nodes. In the end, even with 
RAMP, I believe you'd still have to make snapshotting commit-id- (or 
timestamp-) sensitive.

As an aside, I think this proposed extension to snapshotting could certainly 
be optional. Many people might not care at all about getting out-of-sequence 
snapshots, and you would definitely take a performance hit with the process I 
suggest above.
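For what it's worth, the interleaving in the quoted report below is easy to reproduce with a toy in-memory model (the `tables`/`snapshot` dicts and the helper functions here are purely illustrative, not actual Cassandra code):

```python
# Toy model: each table is a list of flushed sstables; a snapshot records
# the sstables hard-linked for a table at the moment that table is processed.
tables = {"A": [], "B": []}
snapshot = {}

def write(table, value):
    # Assume the write ends up in an sstable on disk.
    tables[table].append(value)

def snapshot_table(table):
    # Link whatever sstables exist for this table *right now*.
    snapshot[table] = list(tables[table])

# Timeline from the report:
snapshot_table("A")   # snapshot flushes and links table A
write("A", "row-a")   # insert into table A (lands after A was linked)
write("B", "row-b")   # insert into table B (chronologically after row-a)
snapshot_table("B")   # snapshot flushes and links table B

# The snapshot now contains row-b but not row-a, even though
# row-b was written after row-a.
assert "row-b" in snapshot["B"]
assert "row-a" not in snapshot["A"]
```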

> Add a more consistent snapshot mechanism
> ----------------------------------------
>
>                 Key: CASSANDRA-7720
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7720
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Mike Schrag
>
> We’ve hit an interesting issue with snapshotting, which makes sense in 
> hindsight, but presents an interesting challenge for consistent restores:
> * initiate snapshot
> * snapshotting flushes table A and takes the snapshot
> * insert into table A
> * insert into table B
> * snapshotting flushes table B and takes the snapshot
> * snapshot finishes
> So what happens here is that the snapshot ends up containing B but NOT A, 
> even though B was chronologically inserted after A.
> It makes sense when I think about what snapshot is doing, but I wonder 
> whether snapshots should get a little fancier and behave more like what I 
> think most people would expect. What I think should happen is something 
> along the lines of the following:
> For each node:
> * pass a client timestamp in the snapshot call corresponding to "now"
> * snapshot the tables using the existing procedure
> * walk backwards through the hard-linked sstables in that snapshot
>   * if the earliest update in an sstable is after the client's timestamp, 
> delete that sstable from the snapshot
>   * if the earliest update is before the client's timestamp, look at the 
> latest update and walk backwards through that sstable
>     * if any updates fall after the timestamp, make a copy of that sstable in 
> the snapshot folder containing only the data up to the timestamp, then delete 
> the original sstable from the snapshot (we need to copy because we're likely 
> holding a shared hard-linked sstable)
> I think this would guarantee that you have a chronologically consistent view 
> of your snapshot across all machines and columnfamilies within a given 
> snapshot.
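The trimming pass proposed above could be sketched roughly as follows, using a hypothetical in-memory model (real sstables are on-disk files, and `Cell`, `SnapshotSSTable`, and `trim_snapshot` are illustrative names, not Cassandra APIs):

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Cell:
    key: str
    value: str
    timestamp: int  # client write timestamp

@dataclass
class SnapshotSSTable:
    """Hypothetical stand-in for a hard-linked snapshot sstable."""
    cells: List[Cell]  # assumed sorted by write timestamp

    @property
    def min_timestamp(self) -> int:
        return self.cells[0].timestamp

    @property
    def max_timestamp(self) -> int:
        return self.cells[-1].timestamp

def trim_snapshot(sstables: List[SnapshotSSTable],
                  snapshot_ts: int) -> List[SnapshotSSTable]:
    """Keep only data written at or before snapshot_ts.

    - an sstable whose earliest update is after the cutoff is dropped;
    - an sstable that straddles the cutoff is rewritten as a copy holding
      only the cells up to the cutoff, because the original is a shared
      hard link and must not be modified in place;
    - an sstable entirely before the cutoff is kept as-is.
    """
    result = []
    for sstable in sstables:
        if sstable.min_timestamp > snapshot_ts:
            continue  # entirely newer than the snapshot point: delete
        if sstable.max_timestamp <= snapshot_ts:
            result.append(sstable)  # entirely covered: keep the hard link
        else:
            kept = [c for c in sstable.cells if c.timestamp <= snapshot_ts]
            result.append(SnapshotSSTable(cells=kept))  # rewritten copy
    return result
```

Applying `trim_snapshot` with the same client timestamp on every node would give the chronologically consistent cut described above, at the cost of rewriting any sstable that straddles the cutoff.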



--
This message was sent by Atlassian JIRA
(v6.2#6252)
