[jira] [Updated] (CASSANDRA-17955) Race condition on repair snapshots

2022-11-22 Thread Michael Semb Wever (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Semb Wever updated CASSANDRA-17955:
---
Fix Version/s: (was: 4.1-rc)

> Race condition on repair snapshots
> --
>
> Key: CASSANDRA-17955
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17955
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair, Local/Snapshots
>Reporter: Cameron Zemek
>Assignee: Stefan Miklosovic
>Priority: Normal
>  Labels: 4.0
> Fix For: 4.0.8, 4.1-rc1, 4.1, 4.2
>
> Attachments: signature.asc
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> If an endpoint is convicted and that endpoint is a coordinator, then 
> ActiveRepairService::removeParentRepairSession is called.
> The issue is that this occurs on clearSnapshotExecutor and can happen while 
> RepairMessageVerbHandler is in the process of taking a snapshot. The result 
> is a race condition in which clearSnapshot throws a 
> java.nio.file.DirectoryNotEmptyException:
>  
> {code:java}
> public static void deleteRecursiveWithThrottle(File dir, RateLimiter rateLimiter)
> {
>     if (dir.isDirectory())
>     {
>         String[] children = dir.list();
>         for (String child : children)
>             deleteRecursiveWithThrottle(new File(dir, child), rateLimiter);
>     }
>     // The directory is now empty so now it can be smoked
>     deleteWithConfirmWithThrottle(dir, rateLimiter);
> }
> {code}
> The exception is thrown because the directory is no longer empty by the time 
> the method goes to remove it at the end.
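
The window described above can be reproduced deterministically outside Cassandra. The following is only a sketch under stated assumptions: the class SnapshotRaceSketch, its methods, and the file names are hypothetical and not part of Cassandra, java.nio.file is used instead of Cassandra's File and RateLimiter helpers, and the "concurrent" snapshot writer is simulated by a hook that runs after the children are deleted and before the directory itself is removed.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryNotEmptyException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

public class SnapshotRaceSketch
{
    // Same shape as the quoted method: delete the children, then the directory.
    // The hook runs between those two steps and stands in for a concurrent writer.
    static void deleteRecursive(Path dir, Runnable afterChildrenDeleted) throws IOException
    {
        if (Files.isDirectory(dir))
        {
            Path[] children;
            try (Stream<Path> listing = Files.list(dir))
            {
                children = listing.toArray(Path[]::new);
            }
            for (Path child : children)
                deleteRecursive(child, () -> {});
            afterChildrenDeleted.run();
        }
        // Fails if something was created in the directory after it was listed
        Files.delete(dir);
    }

    static void createFile(Path file)
    {
        try
        {
            Files.createFile(file);
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
    }

    // Returns true if the final delete failed because the directory was not empty
    static boolean reproducesRace()
    {
        try
        {
            Path dir = Files.createTempDirectory("snapshot-race");
            Path late = dir.resolve("late-sstable.db"); // hypothetical file name
            createFile(dir.resolve("old-sstable.db"));
            try
            {
                // Simulate a snapshot being written while the clear is in progress
                deleteRecursive(dir, () -> createFile(late));
                return false;
            }
            catch (DirectoryNotEmptyException e)
            {
                return true;
            }
            finally
            {
                // Clean up the temporary directory regardless of the outcome
                Files.deleteIfExists(late);
                Files.deleteIfExists(dir);
            }
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args)
    {
        System.out.println("DirectoryNotEmptyException reproduced: " + reproducesRace());
    }
}
```

In the real system the late file would come from another thread rather than a hook, so the failure is intermittent instead of guaranteed.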



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-17955) Race condition on repair snapshots

2022-11-22 Thread Michael Semb Wever (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Semb Wever updated CASSANDRA-17955:
---
Fix Version/s: 4.1-rc1

> Race condition on repair snapshots
> --
>
> Key: CASSANDRA-17955
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17955
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair, Local/Snapshots
>Reporter: Cameron Zemek
>Assignee: Stefan Miklosovic
>Priority: Normal
>  Labels: 4.0
> Fix For: 4.0.8, 4.1-rc1, 4.1-rc, 4.1, 4.2
>
> Attachments: signature.asc
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>






[jira] [Updated] (CASSANDRA-17955) Race condition on repair snapshots

2022-11-22 Thread Michael Semb Wever (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Semb Wever updated CASSANDRA-17955:
---
Fix Version/s: 4.1

> Race condition on repair snapshots
> --
>
> Key: CASSANDRA-17955
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17955
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair, Local/Snapshots
>Reporter: Cameron Zemek
>Assignee: Stefan Miklosovic
>Priority: Normal
>  Labels: 4.0
> Fix For: 4.0.8, 4.1-rc, 4.1, 4.2
>
> Attachments: signature.asc
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>






[jira] [Updated] (CASSANDRA-17955) Race condition on repair snapshots

2022-10-27 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic updated CASSANDRA-17955:
--
      Fix Version/s: 4.0.8
                     4.2
                         (was: 4.x)
                         (was: 4.0.x)
      Since Version: 4.0
Source Control Link: https://github.com/apache/cassandra/commit/35ef5b99577ef8b04b8d4b326154775f510ade42
         Resolution: Fixed
             Status: Resolved  (was: Ready to Commit)

> Race condition on repair snapshots
> --
>
> Key: CASSANDRA-17955
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17955
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair, Local/Snapshots
>Reporter: Cameron Zemek
>Assignee: Stefan Miklosovic
>Priority: Normal
>  Labels: 4.0
> Fix For: 4.0.8, 4.1-rc, 4.2
>
> Attachments: signature.asc
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>






[jira] [Updated] (CASSANDRA-17955) Race condition on repair snapshots

2022-10-27 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic updated CASSANDRA-17955:
--
Status: Review In Progress  (was: Needs Committer)

> Race condition on repair snapshots
> --
>
> Key: CASSANDRA-17955
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17955
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair, Local/Snapshots
>Reporter: Cameron Zemek
>Assignee: Stefan Miklosovic
>Priority: Normal
>  Labels: 4.0
> Fix For: 4.0.x, 4.1-rc, 4.x
>
> Attachments: signature.asc
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>






[jira] [Updated] (CASSANDRA-17955) Race condition on repair snapshots

2022-10-27 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic updated CASSANDRA-17955:
--
Status: Needs Committer  (was: Patch Available)

> Race condition on repair snapshots
> --
>
> Key: CASSANDRA-17955
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17955
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair, Local/Snapshots
>Reporter: Cameron Zemek
>Assignee: Stefan Miklosovic
>Priority: Normal
>  Labels: 4.0
> Fix For: 4.0.x, 4.1-rc, 4.x
>
> Attachments: signature.asc
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>






[jira] [Updated] (CASSANDRA-17955) Race condition on repair snapshots

2022-10-27 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic updated CASSANDRA-17955:
--
Status: Ready to Commit  (was: Review In Progress)

> Race condition on repair snapshots
> --
>
> Key: CASSANDRA-17955
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17955
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair, Local/Snapshots
>Reporter: Cameron Zemek
>Assignee: Stefan Miklosovic
>Priority: Normal
>  Labels: 4.0
> Fix For: 4.0.x, 4.1-rc, 4.x
>
> Attachments: signature.asc
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>






[jira] [Updated] (CASSANDRA-17955) Race condition on repair snapshots

2022-10-10 Thread miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

miklosovic updated CASSANDRA-17955:
---
Attachment: signature.asc

I do not have a reproducer yet. I will try to write one, but my gut feeling is 
that it won't be easy. I checked the tests and I think you wrote one for repairs 
where one node went down. I might take that as a base and refactor it.



> Race condition on repair snapshots
> --
>
> Key: CASSANDRA-17955
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17955
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair, Local/Snapshots
>Reporter: Cameron Zemek
>Assignee: Stefan Miklosovic
>Priority: Normal
>  Labels: 4.0
> Fix For: 4.0.x, 4.1-rc, 4.x
>
> Attachments: signature.asc
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>






[jira] [Updated] (CASSANDRA-17955) Race condition on repair snapshots

2022-10-10 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic updated CASSANDRA-17955:
--
Test and Documentation Plan: ci
                     Status: Patch Available  (was: In Progress)

This might be the solution (1). Basically, we need to make sure that a new 
snapshot is not taken until snapshot clearing has finished. The executor in 
ActiveRepairService runs snapshot cleanup in a non-blocking way, and it can run 
only one thread at any given time. 

CassandraTableRepairManager takes an ephemeral snapshot, and this can race with 
ActiveRepairService while a snapshot is being cleared: the clearing code expects 
the directory to be empty, but it is not, because CassandraTableRepairManager 
has just created a snapshot in it. 

(1) https://github.com/apache/cassandra/pull/1903/files
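
One way to get the ordering described above is to route both operations through the same single-threaded executor, so that a snapshot request queues behind any clear that is already pending. The sketch below only illustrates that idea and is not the code of the linked pull request: the class SerializedSnapshotSketch, its methods, and the tag "repair-1" are hypothetical, and the actual file operations are replaced by event logging.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SerializedSnapshotSketch
{
    // One worker thread: tasks run strictly in submission order
    private final ExecutorService snapshotExecutor = Executors.newSingleThreadExecutor();
    private final List<String> events = Collections.synchronizedList(new ArrayList<>());

    // Non-blocking, like the cleanup described in the comment above
    Future<?> clearSnapshotAsync(String tag)
    {
        return snapshotExecutor.submit(() -> {
            events.add("clear-start:" + tag);
            pause(50); // stand-in for a slow recursive delete
            events.add("clear-end:" + tag);
        });
    }

    // Blocking: the caller waits, and the task cannot start before earlier clears finish
    void takeSnapshot(String tag)
    {
        Future<?> done = snapshotExecutor.submit(() -> { events.add("snapshot:" + tag); });
        try
        {
            done.get();
        }
        catch (InterruptedException e)
        {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        catch (ExecutionException e)
        {
            throw new IllegalStateException(e.getCause());
        }
    }

    private static void pause(long millis)
    {
        try
        {
            Thread.sleep(millis);
        }
        catch (InterruptedException e)
        {
            Thread.currentThread().interrupt();
        }
    }

    // Queues a clear, then takes a snapshot, and returns the observed event order
    static List<String> runDemo()
    {
        SerializedSnapshotSketch sketch = new SerializedSnapshotSketch();
        try
        {
            sketch.clearSnapshotAsync("repair-1");
            sketch.takeSnapshot("repair-1");
            return new ArrayList<>(sketch.events);
        }
        finally
        {
            sketch.snapshotExecutor.shutdown();
        }
    }

    public static void main(String[] args)
    {
        System.out.println(runDemo());
    }
}
```

Because the executor has a single worker, the snapshot event can only be recorded after the clear has completed, which is the property the comment above asks for. The trade-off is that snapshot requests wait behind slow clears.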

> Race condition on repair snapshots
> --
>
> Key: CASSANDRA-17955
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17955
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair, Local/Snapshots
>Reporter: Cameron Zemek
>Assignee: Stefan Miklosovic
>Priority: Normal
>  Labels: 4.0
> Fix For: 4.0.x, 4.1-rc, 4.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Updated] (CASSANDRA-17955) Race condition on repair snapshots

2022-10-10 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic updated CASSANDRA-17955:
--
Fix Version/s: 4.0.x
               4.1-rc
               4.x

> Race condition on repair snapshots
> --
>
> Key: CASSANDRA-17955
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17955
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair, Local/Snapshots
>Reporter: Cameron Zemek
>Assignee: Stefan Miklosovic
>Priority: Normal
>  Labels: 4.0
> Fix For: 4.0.x, 4.1-rc, 4.x
>
>






[jira] [Updated] (CASSANDRA-17955) Race condition on repair snapshots

2022-10-10 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic updated CASSANDRA-17955:
--
 Bug Category: Parent values: Correctness(12982)
   Complexity: Normal
  Component/s: Consistency/Repair
               Local/Snapshots
Discovered By: User Report
     Severity: Normal
       Status: Open  (was: Triage Needed)

> Race condition on repair snapshots
> --
>
> Key: CASSANDRA-17955
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17955
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair, Local/Snapshots
>Reporter: Cameron Zemek
>Assignee: Stefan Miklosovic
>Priority: Normal
>  Labels: 4.0
>






[jira] [Updated] (CASSANDRA-17955) Race condition on repair snapshots

2022-10-09 Thread Cameron Zemek (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cameron Zemek updated CASSANDRA-17955:
--
Labels: 4.0  (was: )

> Race condition on repair snapshots
> --
>
> Key: CASSANDRA-17955
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17955
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Cameron Zemek
>Priority: Normal
>  Labels: 4.0
>






[jira] [Updated] (CASSANDRA-17955) Race condition on repair snapshots

2022-10-09 Thread Cameron Zemek (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cameron Zemek updated CASSANDRA-17955:
--
Description: 
If an endpoint is convicted and that endpoint is a coordinator, then 
ActiveRepairService::removeParentRepairSession is called.

The issue is that this occurs on clearSnapshotExecutor and can happen while 
RepairMessageVerbHandler is in the process of taking a snapshot. The result is 
a race condition in which clearSnapshot throws a 
java.nio.file.DirectoryNotEmptyException:

 
{code:java}
public static void deleteRecursiveWithThrottle(File dir, RateLimiter rateLimiter)
{
    if (dir.isDirectory())
    {
        String[] children = dir.list();
        for (String child : children)
            deleteRecursiveWithThrottle(new File(dir, child), rateLimiter);
    }

    // The directory is now empty so now it can be smoked
    deleteWithConfirmWithThrottle(dir, rateLimiter);
}
{code}
The exception is thrown because the directory is no longer empty by the time 
the method goes to remove it at the end.

  was:
If an endpoint is convicted and that endpoint is a coordinator then 
ActiveRepairService::

removeParentRepairSession is called.

The issue is that this occurs on clearSnapshotExecutor and can happen while 

RepairMessageVerbHandler is in process of taking a snapshot. So then you get a 
race condition and clearSnapshot will throw a 
java.nio.file.DirectoryNotEmptyException

 
{code:java}
public static void deleteRecursiveWithThrottle(File dir, RateLimiter rateLimiter)
{
    if (dir.isDirectory())
    {
        String[] children = dir.list();
        for (String child : children)
            deleteRecursiveWithThrottle(new File(dir, child), rateLimiter);
    }

    // The directory is now empty so now it can be smoked
    deleteWithConfirmWithThrottle(dir, rateLimiter);
}
{code}
Due to the directory not being empty when it goes to remove the directory at 
the end.


> Race condition on repair snapshots
> --
>
> Key: CASSANDRA-17955
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17955
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Cameron Zemek
>Priority: Normal
>  Labels: 4.0
>


