[jira] [Updated] (CASSANDRA-17955) Race condition on repair snapshots
     [ https://issues.apache.org/jira/browse/CASSANDRA-17955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Semb Wever updated CASSANDRA-17955:
-------------------------------------------
    Fix Version/s:     (was: 4.1-rc)

> Race condition on repair snapshots
> ----------------------------------
>
>                 Key: CASSANDRA-17955
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17955
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Repair, Local/Snapshots
>            Reporter: Cameron Zemek
>            Assignee: Stefan Miklosovic
>            Priority: Normal
>              Labels: 4.0
>             Fix For: 4.0.8, 4.1-rc1, 4.1, 4.2
>
>         Attachments: signature.asc
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> If an endpoint is convicted and that endpoint is a coordinator then
> ActiveRepairService::removeParentRepairSession is called.
> The issue is that this occurs on clearSnapshotExecutor and can happen while
> RepairMessageVerbHandler is in process of taking a snapshot. So then you get
> a race condition and clearSnapshot will throw a
> java.nio.file.DirectoryNotEmptyException
>
> {code:java}
> public static void deleteRecursiveWithThrottle(File dir, RateLimiter rateLimiter)
> {
>     if (dir.isDirectory())
>     {
>         String[] children = dir.list();
>         for (String child : children)
>             deleteRecursiveWithThrottle(new File(dir, child), rateLimiter);
>     }
>     // The directory is now empty so now it can be smoked
>     deleteWithConfirmWithThrottle(dir, rateLimiter);
> }
> {code}
> Due to the directory not being empty when it goes to remove the directory at
> the end.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
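The failure described in the report can be illustrated outside Cassandra. The following is only a minimal sketch, not Cassandra code: the class name `SnapshotDeleteRace`, the file names, and the temporary directory are hypothetical, and the concurrent snapshot writer is simulated by creating a file between the "delete children" step and the "delete directory" step. It assumes a platform where `Files.delete` reports a non-empty directory as `DirectoryNotEmptyException` (the JDK documents this as an optional specific exception).

```java
import java.io.IOException;
import java.nio.file.DirectoryNotEmptyException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class SnapshotDeleteRace
{
    /** Returns true if deleting the directory failed with DirectoryNotEmptyException. */
    public static boolean reproduce() throws IOException
    {
        // Hypothetical stand-in for a snapshot directory with one existing file.
        Path dir = Files.createTempDirectory("snapshot-race");
        Files.createFile(dir.resolve("existing.db"));

        // Step 1: the deleting side lists the children it knows about.
        List<Path> children = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir))
        {
            for (Path child : stream)
                children.add(child);
        }

        // Step 2: it deletes exactly those children.
        for (Path child : children)
            Files.delete(child);

        // Step 3: simulated concurrent writer adds a file after the listing was taken.
        Path late = Files.createFile(dir.resolve("late.db"));

        // Step 4: the deleting side assumes the directory is empty and removes it.
        try
        {
            Files.delete(dir);
            return false;
        }
        catch (DirectoryNotEmptyException e)
        {
            return true;
        }
        finally
        {
            // Clean up the temporary files regardless of the outcome.
            Files.deleteIfExists(late);
            Files.deleteIfExists(dir);
        }
    }

    public static void main(String[] args) throws IOException
    {
        System.out.println("DirectoryNotEmptyException thrown: " + reproduce());
    }
}
```

The sketch makes the interleaving deterministic; in the reported bug the same ordering would only occur when the two threads happen to overlap.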
[jira] [Updated] (CASSANDRA-17955) Race condition on repair snapshots
     [ https://issues.apache.org/jira/browse/CASSANDRA-17955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Semb Wever updated CASSANDRA-17955:
-------------------------------------------
    Fix Version/s: 4.1-rc1

> Race condition on repair snapshots
> ----------------------------------
>
>                 Key: CASSANDRA-17955
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17955
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Repair, Local/Snapshots
>            Reporter: Cameron Zemek
>            Assignee: Stefan Miklosovic
>            Priority: Normal
>              Labels: 4.0
>             Fix For: 4.0.8, 4.1-rc1, 4.1-rc, 4.1, 4.2
>
>         Attachments: signature.asc
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
[jira] [Updated] (CASSANDRA-17955) Race condition on repair snapshots
     [ https://issues.apache.org/jira/browse/CASSANDRA-17955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Semb Wever updated CASSANDRA-17955:
-------------------------------------------
    Fix Version/s: 4.1

> Race condition on repair snapshots
> ----------------------------------
>
>                 Key: CASSANDRA-17955
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17955
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Repair, Local/Snapshots
>            Reporter: Cameron Zemek
>            Assignee: Stefan Miklosovic
>            Priority: Normal
>              Labels: 4.0
>             Fix For: 4.0.8, 4.1-rc, 4.1, 4.2
>
>         Attachments: signature.asc
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
[jira] [Updated] (CASSANDRA-17955) Race condition on repair snapshots
     [ https://issues.apache.org/jira/browse/CASSANDRA-17955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stefan Miklosovic updated CASSANDRA-17955:
------------------------------------------
          Fix Version/s: 4.0.8
                         4.2
                             (was: 4.x)
                             (was: 4.0.x)
          Since Version: 4.0
    Source Control Link: https://github.com/apache/cassandra/commit/35ef5b99577ef8b04b8d4b326154775f510ade42
             Resolution: Fixed
                 Status: Resolved  (was: Ready to Commit)

> Race condition on repair snapshots
> ----------------------------------
>
>                 Key: CASSANDRA-17955
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17955
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Repair, Local/Snapshots
>            Reporter: Cameron Zemek
>            Assignee: Stefan Miklosovic
>            Priority: Normal
>              Labels: 4.0
>             Fix For: 4.0.8, 4.1-rc, 4.2
>
>         Attachments: signature.asc
>
>          Time Spent: 50m
>  Remaining Estimate: 0h
[jira] [Updated] (CASSANDRA-17955) Race condition on repair snapshots
     [ https://issues.apache.org/jira/browse/CASSANDRA-17955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stefan Miklosovic updated CASSANDRA-17955:
------------------------------------------
    Status: Review In Progress  (was: Needs Committer)

> Race condition on repair snapshots
> ----------------------------------
>
>                 Key: CASSANDRA-17955
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17955
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Repair, Local/Snapshots
>            Reporter: Cameron Zemek
>            Assignee: Stefan Miklosovic
>            Priority: Normal
>              Labels: 4.0
>             Fix For: 4.0.x, 4.1-rc, 4.x
>
>         Attachments: signature.asc
>
>          Time Spent: 50m
>  Remaining Estimate: 0h
[jira] [Updated] (CASSANDRA-17955) Race condition on repair snapshots
     [ https://issues.apache.org/jira/browse/CASSANDRA-17955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stefan Miklosovic updated CASSANDRA-17955:
------------------------------------------
    Status: Needs Committer  (was: Patch Available)

> Race condition on repair snapshots
> ----------------------------------
>
>                 Key: CASSANDRA-17955
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17955
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Repair, Local/Snapshots
>            Reporter: Cameron Zemek
>            Assignee: Stefan Miklosovic
>            Priority: Normal
>              Labels: 4.0
>             Fix For: 4.0.x, 4.1-rc, 4.x
>
>         Attachments: signature.asc
>
>          Time Spent: 50m
>  Remaining Estimate: 0h
[jira] [Updated] (CASSANDRA-17955) Race condition on repair snapshots
     [ https://issues.apache.org/jira/browse/CASSANDRA-17955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stefan Miklosovic updated CASSANDRA-17955:
------------------------------------------
    Status: Ready to Commit  (was: Review In Progress)

> Race condition on repair snapshots
> ----------------------------------
>
>                 Key: CASSANDRA-17955
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17955
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Repair, Local/Snapshots
>            Reporter: Cameron Zemek
>            Assignee: Stefan Miklosovic
>            Priority: Normal
>              Labels: 4.0
>             Fix For: 4.0.x, 4.1-rc, 4.x
>
>         Attachments: signature.asc
>
>          Time Spent: 50m
>  Remaining Estimate: 0h
[jira] [Updated] (CASSANDRA-17955) Race condition on repair snapshots
     [ https://issues.apache.org/jira/browse/CASSANDRA-17955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

miklosovic updated CASSANDRA-17955:
-----------------------------------
    Attachment: signature.asc

I do not have a reproducer yet. I will try to write one, but my gut feeling is that it won't be so easy. I checked the tests and I think you wrote one for repairs where one node goes down. I might take that as a base and refactor it.

> Race condition on repair snapshots
> ----------------------------------
>
>                 Key: CASSANDRA-17955
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17955
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Repair, Local/Snapshots
>            Reporter: Cameron Zemek
>            Assignee: Stefan Miklosovic
>            Priority: Normal
>              Labels: 4.0
>             Fix For: 4.0.x, 4.1-rc, 4.x
>
>         Attachments: signature.asc
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
[jira] [Updated] (CASSANDRA-17955) Race condition on repair snapshots
     [ https://issues.apache.org/jira/browse/CASSANDRA-17955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stefan Miklosovic updated CASSANDRA-17955:
------------------------------------------
    Test and Documentation Plan: ci
                         Status: Patch Available  (was: In Progress)

This might be the solution (1). Basically, we need to make sure that a new snapshot is not taken until snapshots are cleared.

The executor in ActiveRepairService runs snapshot cleanup in a non-blocking way, and it can run only one thread at any given time. CassandraTableRepairManager takes an ephemeral snapshot, and that can race with ActiveRepairService while a snapshot is being cleared: the cleanup expects the directory to be empty, but it is not, because CassandraTableRepairManager has created a snapshot in it.

(1) https://github.com/apache/cassandra/pull/1903/files

> Race condition on repair snapshots
> ----------------------------------
>
>                 Key: CASSANDRA-17955
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17955
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Repair, Local/Snapshots
>            Reporter: Cameron Zemek
>            Assignee: Stefan Miklosovic
>            Priority: Normal
>              Labels: 4.0
>             Fix For: 4.0.x, 4.1-rc, 4.x
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
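One way to get the ordering described in the comment above ("a new snapshot is not taken until snapshots are cleared") is to route both operations through the same single-threaded executor, so they can never interleave. The following is only a sketch under that assumption and is not taken from the linked pull request: `SerializedSnapshots`, `takeSnapshot`, `clearSnapshot`, and the file names are hypothetical and do not correspond to Cassandra's actual classes or signatures.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SerializedSnapshots implements AutoCloseable
{
    // One worker thread: submitted tasks run one at a time, in submission order.
    private final ExecutorService snapshotExecutor = Executors.newSingleThreadExecutor();

    /** Hypothetical "take snapshot": creates the directory and one file in it. */
    public Future<?> takeSnapshot(Path snapshotDir, String fileName)
    {
        return snapshotExecutor.submit(() -> {
            try
            {
                Files.createDirectories(snapshotDir);
                Files.createFile(snapshotDir.resolve(fileName));
            }
            catch (IOException e)
            {
                throw new UncheckedIOException(e);
            }
        });
    }

    /** Hypothetical "clear snapshot": deletes the children, then the directory. */
    public Future<?> clearSnapshot(Path snapshotDir)
    {
        return snapshotExecutor.submit(() -> {
            try
            {
                if (!Files.isDirectory(snapshotDir))
                    return;
                try (DirectoryStream<Path> children = Files.newDirectoryStream(snapshotDir))
                {
                    for (Path child : children)
                        Files.delete(child);
                }
                // No snapshot task can run between the loop above and this delete.
                Files.delete(snapshotDir);
            }
            catch (IOException e)
            {
                throw new UncheckedIOException(e);
            }
        });
    }

    @Override
    public void close()
    {
        snapshotExecutor.shutdown();
    }

    /** Submits take, clear, take back to back and reports whether the end state is consistent. */
    public static boolean demo() throws Exception
    {
        Path parent = Files.createTempDirectory("repair");
        Path dir = parent.resolve("snapshots");
        try (SerializedSnapshots snapshots = new SerializedSnapshots())
        {
            snapshots.takeSnapshot(dir, "a.db");
            Future<?> cleared = snapshots.clearSnapshot(dir);
            Future<?> retaken = snapshots.takeSnapshot(dir, "b.db");
            cleared.get(); // a failed delete would surface here as an ExecutionException
            retaken.get();
            boolean consistent = Files.exists(dir.resolve("b.db")) && !Files.exists(dir.resolve("a.db"));

            // Clean up the temporary files.
            snapshots.clearSnapshot(dir).get();
            Files.delete(parent);
            return consistent;
        }
    }

    public static void main(String[] args) throws Exception
    {
        System.out.println("consistent: " + demo());
    }
}
```

The design trade-off is that snapshot creation now queues behind any pending cleanup; a shared lock around both operations would be an alternative with the same ordering guarantee.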
[jira] [Updated] (CASSANDRA-17955) Race condition on repair snapshots
     [ https://issues.apache.org/jira/browse/CASSANDRA-17955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stefan Miklosovic updated CASSANDRA-17955:
------------------------------------------
    Fix Version/s: 4.0.x
                   4.1-rc
                   4.x

> Race condition on repair snapshots
> ----------------------------------
>
>                 Key: CASSANDRA-17955
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17955
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Repair, Local/Snapshots
>            Reporter: Cameron Zemek
>            Assignee: Stefan Miklosovic
>            Priority: Normal
>              Labels: 4.0
>             Fix For: 4.0.x, 4.1-rc, 4.x
[jira] [Updated] (CASSANDRA-17955) Race condition on repair snapshots
     [ https://issues.apache.org/jira/browse/CASSANDRA-17955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stefan Miklosovic updated CASSANDRA-17955:
------------------------------------------
     Bug Category: Parent values: Correctness(12982)
       Complexity: Normal
      Component/s: Consistency/Repair
                   Local/Snapshots
    Discovered By: User Report
         Severity: Normal
           Status: Open  (was: Triage Needed)

> Race condition on repair snapshots
> ----------------------------------
>
>                 Key: CASSANDRA-17955
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17955
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Repair, Local/Snapshots
>            Reporter: Cameron Zemek
>            Assignee: Stefan Miklosovic
>            Priority: Normal
>              Labels: 4.0
[jira] [Updated] (CASSANDRA-17955) Race condition on repair snapshots
     [ https://issues.apache.org/jira/browse/CASSANDRA-17955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cameron Zemek updated CASSANDRA-17955:
--------------------------------------
    Labels: 4.0  (was: )

> Race condition on repair snapshots
> ----------------------------------
>
>                 Key: CASSANDRA-17955
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17955
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Cameron Zemek
>            Priority: Normal
>              Labels: 4.0
[jira] [Updated] (CASSANDRA-17955) Race condition on repair snapshots
     [ https://issues.apache.org/jira/browse/CASSANDRA-17955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cameron Zemek updated CASSANDRA-17955:
--------------------------------------
    Description: 
If an endpoint is convicted and that endpoint is a coordinator then ActiveRepairService::removeParentRepairSession is called.

The issue is that this occurs on clearSnapshotExecutor and can happen while RepairMessageVerbHandler is in process of taking a snapshot. So then you get a race condition and clearSnapshot will throw a java.nio.file.DirectoryNotEmptyException

{code:java}
public static void deleteRecursiveWithThrottle(File dir, RateLimiter rateLimiter)
{
    if (dir.isDirectory())
    {
        String[] children = dir.list();
        for (String child : children)
            deleteRecursiveWithThrottle(new File(dir, child), rateLimiter);
    }
    // The directory is now empty so now it can be smoked
    deleteWithConfirmWithThrottle(dir, rateLimiter);
}
{code}
Due to the directory not being empty when it goes to remove the directory at the end.

  was:
If an endpoint is convicted and that endpoint is a coordinator then ActiveRepairService::
removeParentRepairSession is called.

The issue is that this occurs on clearSnapshotExecutor and can happen while RepairMessageVerbHandler is in process of taking a snapshot. So then you get a race condition and clearSnapshot will throw a java.nio.file.DirectoryNotEmptyException

{code:java}
public static void deleteRecursiveWithThrottle(File dir, RateLimiter rateLimiter)
{
    if (dir.isDirectory())
    {
        String[] children = dir.list();
        for (String child : children)
            deleteRecursiveWithThrottle(new File(dir, child), rateLimiter);
    }
    // The directory is now empty so now it can be smoked
    deleteWithConfirmWithThrottle(dir, rateLimiter);
}
{code}
Due to the directory not being empty when it goes to remove the directory at the end.

> Race condition on repair snapshots
> ----------------------------------
>
>                 Key: CASSANDRA-17955
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17955
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Cameron Zemek
>            Priority: Normal
>              Labels: 4.0