[ https://issues.apache.org/jira/browse/CASSANDRA-11179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15151416#comment-15151416 ]
Jeff Jirsa edited comment on CASSANDRA-11179 at 2/17/16 11:38 PM:
------------------------------------------------------------------

Also true of scrub.

One other side effect of parallelization worth noting is that source files are not freed immediately when each individual sstable completes. If you have n concurrent compactors and 1 sstable is significantly smaller than the others, it will finish very quickly, but for a significant period of time both the original source and the resulting cleaned sstable will co-exist on disk (until all n are done?). That is, it appears that the current parallel code waits for all in-flight tasks to complete before finalizing, and because those tasks run at different speeds, operators are that much more likely to run out of disk during cleanup.


was (Author: jjirsa):
Also true of scrub.

One other side effect of parallelization worth noting is that source files are not freed immediately when each individual sstable completes. If you have 8 concurrent compactors and 1 sstable is significantly smaller than the others, it will finish very quickly, but for a significant period of time both the original source and the resulting cleaned sstable will co-exist on disk. That is, it appears that the current parallel code waits for all in-flight tasks to complete before finalizing, and because those tasks run at different speeds, operators are that much more likely to run out of disk during cleanup.

> Parallel cleanup can lead to disk space exhaustion
> --------------------------------------------------
>
>                 Key: CASSANDRA-11179
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11179
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Compaction, Tools
>            Reporter: Tyler Hobbs
>
> In CASSANDRA-5547, we made cleanup (among other things) run in parallel
> across multiple sstables.
> There have been reports on IRC of this leading to
> disk space exhaustion, because multiple sstables are (almost entirely)
> rewritten at the same time. This seems particularly problematic because
> cleanup is frequently run after a cluster is expanded due to low disk space.
> I'm not really familiar with how we perform free disk space checks now, but
> it sounds like we can make some improvements here. It would be good to
> reduce the concurrency of cleanup operations if there isn't enough free disk
> space to support this.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
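
The suggestion in the issue description, reducing cleanup concurrency when free disk space is short, could be sketched roughly as follows. This is only an illustration in Python, not Cassandra code: the function `max_safe_concurrency` and its parameters are hypothetical, and it assumes the pessimistic case described in the comment, where each rewritten sstable may need as much space as its source and no source is freed until every in-flight task has finished.

```python
def max_safe_concurrency(sstable_sizes, free_bytes, max_compactors):
    """Return how many sstables could be cleaned in parallel without
    exceeding free_bytes (hypothetical helper, not a Cassandra API).

    Assumption: a rewritten sstable can be as large as its source, and
    sources stay on disk until all in-flight tasks complete, so the extra
    space needed for k parallel tasks is at most the sum of the k largest
    source sizes.
    """
    largest_first = sorted(sstable_sizes, reverse=True)
    needed = 0
    safe = 0
    for size in largest_first[:max_compactors]:
        needed += size
        if needed > free_bytes:
            break  # one more parallel task could exhaust the disk
        safe += 1
    return safe  # 0 means even the largest sstable alone may not fit


# Example: three sstables (sizes in GB), 150 GB free, up to 8 compactors.
print(max_safe_concurrency([100, 80, 10], free_bytes=150, max_compactors=8))
```

With 150 GB free, only the 100 GB sstable fits under this worst-case model, so the sketch would fall back to a single cleanup task rather than running all three at once.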