[ 
https://issues.apache.org/jira/browse/CASSANDRA-17955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17623757#comment-17623757
 ] 

Stefan Miklosovic commented on CASSANDRA-17955:
-----------------------------------------------

I ran the pipeline with the proposed approach (running it in a Stage) and, 
while it seems to be OK in 4.0, it does not work in 4.1: there are dozens of 
errors, reproducible locally too. It seems to get stuck and then time out. 

I do not have a lot of time to investigate what is going on here. I ran the 
original patch and it passes the pipeline just fine, so I will go with that one.

I'll formally prepare all the branches and builds and merge the original, 
already approved approach.

> Race condition on repair snapshots
> ----------------------------------
>
>                 Key: CASSANDRA-17955
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17955
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Repair, Local/Snapshots
>            Reporter: Cameron Zemek
>            Assignee: Stefan Miklosovic
>            Priority: Normal
>              Labels: 4.0
>             Fix For: 4.0.x, 4.1-rc, 4.x
>
>         Attachments: signature.asc
>
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> If an endpoint is convicted and that endpoint is a coordinator, then 
> ActiveRepairService::removeParentRepairSession is called.
> The issue is that this occurs on clearSnapshotExecutor and can happen while 
> RepairMessageVerbHandler is in the process of taking a snapshot. The result 
> is a race condition in which clearSnapshot throws a 
> java.nio.file.DirectoryNotEmptyException.
>  
> {code:java}
> public static void deleteRecursiveWithThrottle(File dir, RateLimiter rateLimiter)
> {
>     if (dir.isDirectory())
>     {
>         String[] children = dir.list();
>         for (String child : children)
>             deleteRecursiveWithThrottle(new File(dir, child), rateLimiter);
>     }
>     // The directory is now empty so now it can be smoked
>     deleteWithConfirmWithThrottle(dir, rateLimiter);
> } {code}
> The exception is thrown because the directory is no longer empty when the 
> method goes to remove it at the end.
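
The interleaving described above can be reproduced deterministically in a 
small standalone sketch. It is not Cassandra code: the class name, the 
afterListing hook, and the file names are hypothetical, and the hook merely 
stands in for a snapshot writer adding a file after the children have been 
listed but before the final directory delete. It also assumes the default 
file system provider, which reports a non-empty directory with 
DirectoryNotEmptyException.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryNotEmptyException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical sketch, not Cassandra code.
final class SnapshotRaceSketch
{
    // Same shape as the recursive delete quoted above, minus throttling.
    // afterListing runs once the children have been listed, which is the
    // window in which a concurrent snapshot could add files.
    static void deleteRecursive(Path dir, Runnable afterListing) throws IOException
    {
        if (Files.isDirectory(dir))
        {
            List<Path> children = new ArrayList<>();
            try (Stream<Path> listing = Files.list(dir))
            {
                listing.forEach(children::add);
            }
            afterListing.run();
            for (Path child : children)
                deleteRecursive(child, () -> {});
        }
        // Assumes the directory is empty by now; fails if a file arrived late.
        Files.delete(dir);
    }

    // Returns true if a file created after the listing makes the delete fail.
    static boolean concurrentWriteBreaksDelete()
    {
        try
        {
            Path dir = Files.createTempDirectory("snapshot-sketch");
            Files.createFile(dir.resolve("existing.db"));
            Path late = dir.resolve("late.db");
            try
            {
                deleteRecursive(dir, () -> {
                    try { Files.createFile(late); }
                    catch (IOException e) { throw new UncheckedIOException(e); }
                });
                return false;
            }
            catch (DirectoryNotEmptyException e)
            {
                return true;
            }
            finally
            {
                // Clean up whatever the failed delete left behind.
                Files.deleteIfExists(late);
                Files.deleteIfExists(dir);
            }
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
    }

    // Returns true if the same delete succeeds when nothing interferes.
    static boolean undisturbedDeleteSucceeds()
    {
        try
        {
            Path dir = Files.createTempDirectory("snapshot-sketch");
            Files.createFile(dir.resolve("existing.db"));
            deleteRecursive(dir, () -> {});
            return Files.notExists(dir);
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args)
    {
        System.out.println("late write breaks delete: " + concurrentWriteBreaksDelete());
        System.out.println("undisturbed delete succeeds: " + undisturbedDeleteSucceeds());
    }
}
```

The hook makes the timing explicit instead of relying on two threads, so the 
failure shows up on every run rather than intermittently.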



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
