Blake Eggleston created CASSANDRA-13797:
-------------------------------------------

             Summary: RepairJob blocks on syncTasks
                 Key: CASSANDRA-13797
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13797
             Project: Cassandra
          Issue Type: Bug
            Reporter: Blake Eggleston
            Assignee: Blake Eggleston
             Fix For: 4.0


The thread running {{RepairJob}} blocks while it waits for the validations it 
starts to complete ([see 
here|https://github.com/bdeggleston/cassandra/blob/9fdec0a82851f5c35cd21d02e8c4da8fc685edb2/src/java/org/apache/cassandra/repair/RepairJob.java#L185]).
However, the downstream callbacks (i.e. the post-repair cleanup) aren't 
waiting for {{RepairJob#run}} to return; they're waiting for a result to be set 
on the {{RepairJob}} future, which happens after the sync tasks have completed. 
The post-repair cleanup also immediately shuts down the executor that 
{{RepairJob#run}} is running in. So in no-op repair sessions, where there's 
nothing to stream, I'm seeing the callbacks sometimes fire before 
{{RepairJob#run}} wakes up, which causes an {{InterruptedException}} to be thrown.
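
To make the race concrete, here is a minimal sketch using only java.util.concurrent. It is not the Cassandra code: the class, the latches, and the result string are hypothetical stand-ins. The "job" blocks on a latch that is never released (standing in for the blocking wait in {{RepairJob#run}}), while a callback on the result future shuts down the job's executor as soon as the result is set from another thread.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for the situation described above, not Cassandra code.
final class BlockingJobRace {

    /** Returns true if the job thread was interrupted during its blocking wait. */
    static boolean jobWasInterrupted() {
        ExecutorService jobExecutor = Executors.newSingleThreadExecutor();
        CompletableFuture<String> jobResult = new CompletableFuture<>();
        CountDownLatch jobStarted = new CountDownLatch(1);
        CountDownLatch neverReleased = new CountDownLatch(1);
        CountDownLatch jobExited = new CountDownLatch(1);
        AtomicBoolean interrupted = new AtomicBoolean(false);

        // Cleanup waits on the result future, not on run() returning,
        // and shuts down the executor the job is still running in.
        jobResult.thenRun(jobExecutor::shutdownNow);

        jobExecutor.execute(() -> {
            jobStarted.countDown();
            try {
                // Stand-in for the blocking wait inside run(). The latch is
                // never released, which forces the "callback fires first" ordering.
                neverReleased.await();
            } catch (InterruptedException e) {
                interrupted.set(true);
            } finally {
                jobExited.countDown();
            }
        });

        try {
            jobStarted.await();
            // The result is set from another thread while run() is still blocked;
            // the cleanup callback runs right here and calls shutdownNow().
            jobResult.complete("no-op repair finished");
            if (!jobExited.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("job never exited");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return interrupted.get();
    }

    public static void main(String[] args) {
        System.out.println("job thread interrupted: " + jobWasInterrupted());
    }
}
```

In the real code the wait would eventually return on its own, so the interrupt only shows up when the callbacks win the race; the sketch just pins that ordering down.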

I'm pretty sure this blocking wait can just be removed, but I'd like a second 
opinion. It appears to be a holdover from before repair coordination became 
async. I thought it might be doing some throttling by blocking, but each repair 
session gets its own executor, and validation is throttled by the fixed-size 
executors doing the actual work of validation, so I don't think we need to keep 
it around.
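
As a rough illustration of what dropping the wait could look like (again plain java.util.concurrent, with hypothetical names, and not a proposed patch): the job hands its validations to a fixed-size executor, which is where the throttling happens, chains the result onto their completion, and returns without blocking.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of a job that never blocks its own thread.
final class NonBlockingJob {

    /** Runs the job and returns the number of validations it completed. */
    static int runWithoutBlocking() {
        ExecutorService jobExecutor = Executors.newSingleThreadExecutor();
        // Fixed-size pool: this is what limits concurrent validation work.
        ExecutorService validationExecutor = Executors.newFixedThreadPool(2);
        CompletableFuture<Integer> jobResult = new CompletableFuture<>();

        // Same cleanup as before: shut the job executor down once a result is set.
        jobResult.thenRun(jobExecutor::shutdownNow);

        jobExecutor.execute(() -> {
            List<CompletableFuture<Integer>> validations = new ArrayList<>();
            for (int i = 0; i < 3; i++) {
                final int n = i;
                validations.add(CompletableFuture.supplyAsync(() -> n, validationExecutor));
            }
            // Chain the result onto the validations instead of waiting for them.
            CompletableFuture.allOf(validations.toArray(new CompletableFuture[0]))
                .whenComplete((ignored, failure) -> {
                    if (failure != null) {
                        jobResult.completeExceptionally(failure);
                    } else {
                        jobResult.complete(validations.size());
                    }
                });
            // run() returns here; there is no blocking call left to interrupt.
        });

        try {
            return jobResult.get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            validationExecutor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("validations completed: " + runWithoutBlocking());
    }
}
```

The cleanup callback may still interrupt the job thread here, but since that thread is not parked in a blocking call, no {{InterruptedException}} is thrown.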



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org