[ 
https://issues.apache.org/jira/browse/CASSANDRA-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526513#comment-14526513
 ] 

Benedict commented on CASSANDRA-7066:
-------------------------------------

As far as level of integration goes, this looks fine for a base on 8984. 
CASSANDRA-8568 will most likely merge this work with its new "db.lifecycle" 
package, either storing the OperationLog inside of its Transaction object, or 
merging the functionality directly, since the behaviours are quite interwoven. 
Since 8568 has not been review yet, I will merge the functionality after this 
patch gets committed, in case that has to be revised significantly, although 
perhaps you can chime in on review of that ticket as well.

A couple of things I've noticed (not doing a formal review just yet):
* {{OperationLog}} should probably implement Transactional
* {{OperationLog.cleanup()}} (or {{OperationLog.LogFile.delete()}}) should 
force a sync of the directory's file descriptor, to make sure there is a 
happens-before relationship between the two log-file deletions, else if the 
process crashes we may be left with the wrong one deleted and corrupt the 
sstable state
* It looks to me like the SSTableDeleteNotification approach is currently 
broken: at least a set of sstables we care about needs to be maintained to be 
checked against, however I would probably consider adding a Runnable to each 
replaced {{SSTableReader}} using {{runOnClose}}, containing a {{Ref}} to be 
released which, once all are released, removes the log file. This way it's 
tracked, debuggable, and safe against miscounting.
* SSTableReader.DescriptorTypeTidy and SSTableReader.GlobalTidy should be merged
* getTemporaryFiles should probably return a Set, for efficient testing in the 
list methods


> Simplify (and unify) cleanup of compaction leftovers
> ----------------------------------------------------
>
>                 Key: CASSANDRA-7066
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7066
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Benedict
>            Assignee: Stefania
>            Priority: Minor
>              Labels: compaction
>             Fix For: 3.x
>
>
> Currently we manage a list of in-progress compactions in a system table, 
> which we use to cleanup incomplete compactions when we're done. The problem 
> with this is that 1) it's a bit clunky (and leaves us in positions where we 
> can unnecessarily cleanup completed files, or conversely not cleanup files 
> that have been superceded); and 2) it's only used for a regular compaction - 
> no other compaction types are guarded in the same way, so can result in 
> duplication if we fail before deleting the replacements.
> I'd like to see each sstable store in its metadata its direct ancestors, and 
> on startup we simply delete any sstables that occur in the union of all 
> ancestor sets. This way as soon as we finish writing we're capable of 
> cleaning up any leftovers, so we never get duplication. It's also much easier 
> to reason about.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to