[jira] [Commented] (CASSANDRA-7066) Simplify (and unify) cleanup of compaction leftovers

Nick Bailey (JIRA) Wed, 29 Jul 2015 12:13:39 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14646618#comment-14646618
 ]


Nick Bailey commented on CASSANDRA-7066:
----------------------------------------

Thanks for the ping, Jonathan. There is a lot to follow and digest here, so let 
me just bring up my concerns as someone working on OpsCenter. Those 
concerns should represent fairly well those of any other tool trying to do 
backup/restore, or even of a user trying to do it manually.

From what I have tried to read through, it sounds like most of the concerns 
here are around cases where files/directories are manipulated manually rather 
than through the provided tools. So hopefully I can safely be ignored :).

* The snapshot command should create a full backup of a keyspace/table on the 
node. The directories created from the snapshot should be all that is required 
to restore that keyspace/table on that node to the point in time that the 
snapshot was taken.
* A snapshot should be restorable either via the sstableloader tool or by 
manually copying the files from the snapshot into place (given the same 
schema/topology). If copying the files into place manually, restarting the node 
or making an additional call to load the sstables may be required.
* When using the sstableloader tool I should be able to restore data taken from 
a snapshot regardless of what data exists on the node or is currently being 
written.
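
To make the manual-copy case in the second point concrete, here is a minimal sketch. It assumes the usual layout in which a snapshot lives under `<table_dir>/snapshots/<tag>/`; the function name `restore_snapshot` and the example file names are hypothetical, not part of any Cassandra tool.

```python
import shutil
from pathlib import Path
from typing import List


def restore_snapshot(table_dir: Path, tag: str) -> List[Path]:
    """Copy every file from <table_dir>/snapshots/<tag>/ into <table_dir>.

    Assumption: the snapshot directory holds a complete, self-contained
    set of sstable component files for this table.
    """
    snapshot_dir = table_dir / "snapshots" / tag
    if not snapshot_dir.is_dir():
        raise FileNotFoundError(f"no snapshot named {tag!r} under {table_dir}")

    restored = []
    for src in sorted(snapshot_dir.iterdir()):
        if src.is_file():
            dst = table_dir / src.name
            shutil.copy2(src, dst)  # copy contents and file metadata
            restored.append(dst)
    return restored
```

After a copy like this, the node would still need a restart or an explicit call to load the new sstables (for example `nodetool refresh`) before the data becomes visible.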

If we are all good on those points then I don't see any issues from my 
standpoint. [~jbellis] was there anything else you wanted me to look at 
specifically?



> Simplify (and unify) cleanup of compaction leftovers
> ----------------------------------------------------
>
>                 Key: CASSANDRA-7066
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7066
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Benedict
>            Assignee: Stefania
>            Priority: Minor
>              Labels: benedict-to-commit, compaction
>             Fix For: 3.0 alpha 1
>
>         Attachments: 7066.txt
>
>
> Currently we manage a list of in-progress compactions in a system table, 
> which we use to clean up incomplete compactions when we're done. The problem 
> with this is that 1) it's a bit clunky (and leaves us in positions where we 
> can unnecessarily clean up completed files, or conversely not clean up files 
> that have been superseded); and 2) it's only used for a regular compaction - 
> no other compaction types are guarded in the same way, so can result in 
> duplication if we fail before deleting the replacements.
> I'd like to see each sstable store in its metadata its direct ancestors, and 
> on startup we simply delete any sstables that occur in the union of all 
> ancestor sets. This way as soon as we finish writing we're capable of 
> cleaning up any leftovers, so we never get duplication. It's also much easier 
> to reason about.
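
The ancestor-based cleanup proposed in the description above can be sketched roughly as follows. This is an illustration only, not the attached patch: it assumes each sstable is identified by an integer generation, that its metadata lists the generations it was compacted from, and that any sstable visible at startup was fully written. The name `find_leftovers` is hypothetical.

```python
from typing import Dict, Set


def find_leftovers(ancestors_by_sstable: Dict[int, Set[int]]) -> Set[int]:
    """Return the sstables on disk that appear in some other sstable's ancestor set.

    Keys are the generations of the sstables found on disk at startup;
    values are the direct ancestors recorded in each sstable's metadata.
    """
    # Union of all ancestor sets across every sstable on disk.
    superseded: Set[int] = set()
    for ancestors in ancestors_by_sstable.values():
        superseded |= ancestors

    # Only sstables that still exist on disk can be deleted.
    return superseded & set(ancestors_by_sstable)


# Sstables 1 and 2 were compacted into 3, but the node stopped before
# the inputs were deleted: 1 and 2 are leftovers.
on_disk = {1: set(), 2: set(), 3: {1, 2}}
leftovers = find_leftovers(on_disk)
```

Because the decision depends only on what is on disk, no separate record of in-progress compactions is needed for this check.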



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
