[ 
https://issues.apache.org/jira/browse/CASSANDRA-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627789#comment-14627789
 ] 

Benedict commented on CASSANDRA-7066:
-------------------------------------

Sorry for the slow response on this, but since it is quite critical to our 
correctness I wanted to take some time over it. Before we finally commit I will 
likely take another overall run through it all yet.

I've pushed some suggestions 
[here|https://github.com/belliottsmith/cassandra/tree/7066-suggestions]. Some 
minor stylistic suggestions, but there are a couple of core modifications:

* I've modified the {{TransactionLogs}} / {{LifecycleTransaction}} integration 
to match the definition of {{Transactional}} (although I have just noticed the 
{{Transactional}} class comments mismatch the {{commit}} method comments, which 
do cause lack of clarity - we should fix that before commit): 
** {{prepareToCommit}} should never make changes that are not able to be 
rolled-back during {{abort}}, and under the prior design AFAICT the log files 
on disk could be left representing a different state to the one in memory in 
the event of a failure after {{LifecycleTransaction.prepareToCommit}} completes 
successfully. The intention of {{Transactional}} was that in this situation, 
where error inducing behaviour can occur during the logical commit phase, these 
actions all be taken at-once, at the start of the {{commit}} method, and are 
used to determine success/failure - actually throwing the exception.
** Just as importantly, we now stick to one the main goals of 
{{Transactional}}, that each {{Transactional}} object in a graph should forward 
the calls on to the same methods in its child {{Transactional}} objects (the 
idea being it's easier to reason about)
* I've moved the parent directory descriptor sync to ensure it creates a 
happens-before edge between the file deletions and the log deletions
* I've commented that we need to introduce this also between the deletion of 
the *opposing* log file before we continue with any of the contents deletion.

Also, I'd like to propose we hide {{TransactionLogs}} a little, by making its 
class constructor package-private, and ensuring it only ever exists as part of 
a {{LifecycleTransaction}}. I think this would make it much easier for _users_ 
of these classes to reason about things, since there's only one mechanism. 
{{Streaming}} and {{SSTableSimpleWriter}} can both safely use this AFAICT, and 
{{TransactionLogs}} can just expose a {{removeUnfinishedLeftovers}} (or we can 
do so in {{LifecycleTransaction}})

This also needs to be rebased, since dtests are now completely untestable 
(unfortunately cost a non-trivial amount of time to figure out this was because 
of schema changes that have been committed, and the drivers expecting them to 
be present in 3.0)

> Simplify (and unify) cleanup of compaction leftovers
> ----------------------------------------------------
>
>                 Key: CASSANDRA-7066
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7066
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Benedict
>            Assignee: Stefania
>            Priority: Minor
>              Labels: benedict-to-commit, compaction
>             Fix For: 3.x
>
>         Attachments: 7066.txt
>
>
> Currently we manage a list of in-progress compactions in a system table, 
> which we use to cleanup incomplete compactions when we're done. The problem 
> with this is that 1) it's a bit clunky (and leaves us in positions where we 
> can unnecessarily cleanup completed files, or conversely not cleanup files 
> that have been superceded); and 2) it's only used for a regular compaction - 
> no other compaction types are guarded in the same way, so can result in 
> duplication if we fail before deleting the replacements.
> I'd like to see each sstable store in its metadata its direct ancestors, and 
> on startup we simply delete any sstables that occur in the union of all 
> ancestor sets. This way as soon as we finish writing we're capable of 
> cleaning up any leftovers, so we never get duplication. It's also much easier 
> to reason about.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to