[ https://issues.apache.org/jira/browse/CASSANDRA-8766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312681#comment-14312681 ]
Benedict commented on CASSANDRA-8766:
-------------------------------------

I would rather not special case Windows, but we _should_ ensure we never "open early" unless we really are opening early. But the real question is how best to deal with the problem of stopping part way through a compaction. It seems like the cleanup isn't working correctly, which is why we're opening as a TMPLINK file here, since this wasn't the case when we first introduced early opening (it would just open as a final file). It would be by far the easiest to return to opening as a final file and ensure the cleanup happens correctly.

bq. One thing I was never really clear on: w/regards to the early re-open mechanics, is there a reason we create hard-links to the SSTable we're writing and rely on the file-system (renaming) for tracking which files are successfully compacted

Well, we were trying to maintain the semantics that a file only gets renamed to a FINAL descriptor once it's actually complete. The TMPLINK files are an indirection to the TMP files (which will get deleted immediately on completion) and the FINAL files (which they are renamed to). The only way to ensure it can work without interruption is to use a hard link.

Of course, an alternative would be to avoid using any TMP files and to always write directly to the FINAL descriptor, in which case the indirection is unnecessary. It shouldn't in principle be too difficult to do, but there may be assumptions buried in the code somewhere about temporary files. We also need to ensure our cleanup is actually robust, since it isn't working right now. It may be worth attempting it, but probably for 3.0.

bq. On a related note (but likely different ticket) - I get the feeling we could better separate concerns w/how we handle FinishTypes in SSTableWriter. Currently the casing in finish() and close() feels like we're trying to shove multiple branching logical paths into a single method which tends towards brittleness and is also harder to reason about than is necessary.

I'm pretty sure I have the exact opposite view here. We recently merged this into a single method because it was impossible to see what each of the special cases was for in each of the four other highly similar methods. The majority of the code was repeated, which was bad, but mostly it was just not at all clear how behaviour differed, whereas now it's clearly delineated.

> SSTableRewriter opens all sstables as early before completing the compaction
> ----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-8766
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8766
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Benedict
>            Priority: Minor
>             Fix For: 2.1.4
>
>
> In CASSANDRA-8320, we made the rewriter call switchWriter() inside of
> finish(); in CASSANDRA-8124, switchWriter() was made to open its data as EARLY.
> This combination means we no longer honour disabling of early opening, which
> is potentially a problem on Windows for the deletion of the contents (which
> is why we disable early opening on Windows).
> I've commented on CASSANDRA-8124, as I suspect I'm missing something about
> this. Although I have no doubt the old behaviour of opening as a TMP file
> reduced the window for problems, and opening as TMPLINK now does the same,
> it's not entirely clear to me it's the right fix (though it may be), since we
> shouldn't be susceptible to this window anyway. Either way, we perhaps need
> to come up with something else, because this could potentially break Windows
> support. Perhaps if we simply did not swap in the TMPLINK file, so that it
> never actually gets mapped, that would be enough. [~JoshuaMcKenzie],
> WDYT?

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)