[ 
https://issues.apache.org/jira/browse/CASSANDRA-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526389#comment-14526389
 ] 

Stefania commented on CASSANDRA-7066:
-------------------------------------

Thanks for your feedback, [~benedict], [~JoshuaMcKenzie], [~krummas].

I've removed temporary files and descriptor types entirely, with one small 
exception: rewriting metadata still uses a temporary file. I also plan 
on implementing the standalone tool suggested by Marcus.

[~benedict]: assuming you want to be the reviewer, you can start with a quick 
first round if you have some spare time 
(https://github.com/stef1927/cassandra/commits/7066-8984-alt). I am still 
testing (some dtests are broken) and reviewing the code myself, but you may 
want to take a look at the transaction log class, which is called 
*OperationLog*. I am not entirely sure whether the integration of this class 
with the SSTableRewriter and SSTableWriter is what you had in mind, or whether 
it should be more tightly integrated.




> Simplify (and unify) cleanup of compaction leftovers
> ----------------------------------------------------
>
>                 Key: CASSANDRA-7066
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7066
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Benedict
>            Assignee: Stefania
>            Priority: Minor
>              Labels: compaction
>             Fix For: 3.x
>
>
> Currently we manage a list of in-progress compactions in a system table, 
> which we use to clean up incomplete compactions when we're done. The problem 
> with this is that 1) it's a bit clunky (and leaves us in positions where we 
> can unnecessarily clean up completed files, or conversely not clean up files 
> that have been superseded); and 2) it's only used for regular compactions - 
> no other compaction types are guarded in the same way, so they can result in 
> duplication if we fail before deleting the replaced files.
> I'd like to see each sstable store its direct ancestors in its metadata, and 
> on startup we simply delete any sstables that occur in the union of all 
> ancestor sets. This way, as soon as we finish writing, we're capable of 
> cleaning up any leftovers, so we never get duplication. It's also much easier 
> to reason about.
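
For illustration, the startup rule proposed in the description (delete any sstable that occurs in the union of all ancestor sets) could be sketched roughly as below. This is only a minimal sketch, not code from the patch: the class name AncestorCleanup, the method findLeftovers, and the use of integer generation numbers to identify sstables are all assumptions made for the example, and it assumes that only fully written sstables are passed in.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration; none of these names come from Cassandra itself.
public class AncestorCleanup {

    /**
     * Given a map from each on-disk sstable (identified here by an assumed
     * integer generation) to the direct ancestors recorded in its metadata,
     * return the sstables that are themselves listed as an ancestor of some
     * other on-disk sstable. Those have been replaced and can be deleted.
     */
    static Set<Integer> findLeftovers(Map<Integer, Set<Integer>> ancestorsByGeneration) {
        // Union of all ancestor sets.
        Set<Integer> allAncestors = new HashSet<>();
        for (Set<Integer> ancestors : ancestorsByGeneration.values()) {
            allAncestors.addAll(ancestors);
        }
        // Keep only the on-disk sstables that appear in that union.
        Set<Integer> leftovers = new TreeSet<>(ancestorsByGeneration.keySet());
        leftovers.retainAll(allAncestors);
        return leftovers;
    }

    public static void main(String[] args) {
        Map<Integer, Set<Integer>> onDisk = new HashMap<>();
        onDisk.put(1, Collections.<Integer>emptySet());
        onDisk.put(2, Collections.<Integer>emptySet());
        // sstable 3 was produced by compacting 1 and 2, but we failed
        // before 1 and 2 were deleted.
        onDisk.put(3, new HashSet<>(Arrays.asList(1, 2)));
        onDisk.put(4, Collections.<Integer>emptySet());

        // prints "Leftovers to delete: [1, 2]"
        System.out.println("Leftovers to delete: " + findLeftovers(onDisk));
    }
}
```

In this sketch sstables 1 and 2 are reported as leftovers because the finished sstable 3 names them as ancestors, while sstable 4 is untouched. A real implementation would also need to handle partially written replacements, which the sketch leaves out.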



