[jira] [Comment Edited] (CASSANDRA-7066) Simplify (and unify) cleanup of compaction leftovers

Robert Coli (JIRA) Thu, 30 Jul 2015 16:09:19 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14648448#comment-14648448
 ]


Robert Coli edited comment on CASSANDRA-7066 at 7/30/15 11:08 PM:
------------------------------------------------------------------

(apologies if this is sufficiently covered above in the giant list of comments. 
I see that [~tjake] mentions the "refresh" case and that there is related 
discussion, but the specifics don't seem to be addressed.)

[~krummas], [~yukim] and I discussed, in IRC, the following edge case 
regarding the storing of ancestors:

1) NODE A compacts sstables 1 and 2 into sstable 3. 3 gets ancestor value "1,2".
2) sstable 3 is copied into NODE B's data directory and NODE B is restarted, OR 
sstable 3 is copied and NODE B runs "nodetool refresh" (which doesn't, afaik, 
reset ancestor information).
3) sstable 3 on NODE B incorrectly believes its ancestors are NODE B's sstables 
1 and 2.
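
To make the failure mode concrete, here is a minimal Python sketch of the 
cleanup rule proposed in this ticket's description (delete any sstable that 
occurs in the union of all ancestor sets), applied to NODE B after the copy. 
The SSTable class and leftovers_to_delete function are hypothetical names used 
only for illustration, not Cassandra code, and generations are modelled as 
plain integers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SSTable:
    """Hypothetical stand-in for an sstable: its generation number plus the
    generations it was compacted from (its recorded ancestors)."""
    generation: int
    ancestors: frozenset = frozenset()


def leftovers_to_delete(sstables):
    """Return the generations present in the directory that also appear in
    the union of all ancestor sets, i.e. what startup cleanup would remove."""
    all_ancestors = set()
    for s in sstables:
        all_ancestors |= s.ancestors
    present = {s.generation for s in sstables}
    return present & all_ancestors


# NODE B's own live sstables 1 and 2, unrelated to NODE A's.
node_b = [SSTable(1), SSTable(2)]

# sstable 3 copied over from NODE A, still carrying ancestor value "1,2".
node_b.append(SSTable(3, frozenset({1, 2})))

print(sorted(leftovers_to_delete(node_b)))  # [1, 2]
```

Under these assumptions, NODE B's sstables 1 and 2 are treated as compaction 
leftovers even though sstable 3 was never built from them.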

Marcus's response was that we likely need a facility to remove ancestor 
information from sstables. 

I agree with the up-thread statement that both Refresh and LoadNewSSTables are 
likely to be used by experts, but AFAICT those experts still have a need to 
clear ancestor information from sstables which are moving between nodes.
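
One possible shape for such a facility, sketched in Python on the assumption 
that ancestor information is simply dropped when an sstable arrives from 
another node. SSTable and clear_ancestors are hypothetical names, not an 
existing Cassandra or nodetool API.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SSTable:
    """Hypothetical stand-in for an sstable and its recorded ancestors."""
    generation: int
    ancestors: frozenset = frozenset()


def clear_ancestors(sstable):
    """Return a copy of the sstable with its ancestor set emptied; the
    generation is left untouched."""
    return replace(sstable, ancestors=frozenset())


# NODE B's own sstables, plus sstable 3 arriving from NODE A.
node_b = [SSTable(1), SSTable(2)]
from_node_a = SSTable(3, frozenset({1, 2}))

# Strip the foreign ancestor information before the sstable is picked up.
node_b.append(clear_ancestors(from_node_a))

# Nothing on NODE B is now referenced as an ancestor.
referenced = set().union(*(s.ancestors for s in node_b))
print(referenced)  # set()
```

With the ancestors cleared, a rule based on the union of ancestor sets has no 
reason to touch NODE B's sstables 1 and 2.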

Also, [~nickmbailey]'s question regarding Cassandra's behavior on restart when 
it finds unexpected files in the data directory revisits CASSANDRA-6756, which 
was resolved as WONTFIX.


was (Author: rcoli):
(apologies if this is sufficiently covered above in the giant list of comments, 
I see that [~tjake] mentions the "refresh" case and that there is related 
discussion, but the specifics don't seem addressed..)

[~krummas] and [~yukim] and I discussed, in IRC, the following edge case 
regarding storing of ancestors :

1) NODE A compacts sstables 1 and 2 into sstable 3. 3 gets ancestor value "1,2".
2) sstable is copied into NODE B's data directory and NODE B is restarted OR 
sstable is copied and NODE B runs "nodetool refresh" (which doesn't, afaik, 
reset ancestor information)
3) sstable 3 on NODE B incorrectly believes its ancestors are NODE B's sstables 
1 and 2.

Marcus's response was that we likely need a facility to remove ancestor 
information from sstables. 

I agree with the up-thread statement that both Refresh and LoadNewSSTables are 
likely to be used by experts, but AFAICT those experts still have a need to 
clear ancestor information from sstables which are moving between nodes.



> Simplify (and unify) cleanup of compaction leftovers
> ----------------------------------------------------
>
>                 Key: CASSANDRA-7066
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7066
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Benedict
>            Assignee: Stefania
>            Priority: Minor
>              Labels: benedict-to-commit, compaction
>             Fix For: 3.0 alpha 1
>
>         Attachments: 7066.txt
>
>
> Currently we manage a list of in-progress compactions in a system table, 
> which we use to clean up incomplete compactions when we're done. The problem 
> with this is that 1) it's a bit clunky (and leaves us in positions where we 
> can unnecessarily clean up completed files, or conversely not clean up files 
> that have been superseded); and 2) it's only used for a regular compaction - 
> no other compaction types are guarded in the same way, so can result in 
> duplication if we fail before deleting the replacements.
> I'd like to see each sstable store in its metadata its direct ancestors, and 
> on startup we simply delete any sstables that occur in the union of all 
> ancestor sets. This way as soon as we finish writing we're capable of 
> cleaning up any leftovers, so we never get duplication. It's also much easier 
> to reason about.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
