[ https://issues.apache.org/jira/browse/CASSANDRA-3316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sylvain Lebresne resolved CASSANDRA-3316.
-----------------------------------------

    Resolution: Fixed
      Reviewer: slebresne

+1, committed. I don't think it's worth adding a nodetool command (more
precisely, I consider it a feature that this is not too easy to trigger),
because hopefully we don't expect people to need it. It's more about having
a solution available if it comes to that.

> Add a JMX call to force cleaning repair sessions (in case they are hang up)
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-3316
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3316
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.8.6
>            Reporter: Sylvain Lebresne
>            Assignee: Yuki Morishita
>            Priority: Minor
>             Fix For: 1.0.2
>
>         Attachments: 3316-v1.txt
>
>
> A repair session contains many parts, most of which are not local to the
> node (implying the node waits on those operations). You request merkle
> trees, then you schedule streaming (and in 1.0.0, some of the streams don't
> involve the local node itself). There are lots of places where something
> can go wrong, and if it does, it leaves the repair hanging and, as a
> consequence, leaves a repairSession task sitting active on the
> 'AntiEntropy Session' executor.
> Obviously, we should improve how repair detects the things that can go
> wrong. CASSANDRA-2433 started on this and CASSANDRA-3112 is open to cover
> as many of the remaining parts as possible, but my bet is that it will be
> hard to cover everything (and it may not be worth handling very improbable
> failure scenarios). Besides, CASSANDRA-3112 will involve a change in the
> wire protocol, so it may take some time to be committed. In the meantime,
> it would be nice to provide a JMX call to force terminating repairSessions
> so that you don't end up in the case where you have enough 'zombie'
> sessions on the executor that you can't submit new ones (you could restart
> the node but it's ugly).
> Anyway, it's not a big issue but it would be simple to add such a JMX call.
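
Since no nodetool command was added, the call has to be made through a JMX client (jconsole or similar). For illustration only, here is a minimal Java sketch of what invoking such an operation could look like. The message above does not name the MBean or the operation, so the object name `org.apache.cassandra.db:type=StorageService`, the operation name `forceTerminateAllRepairSessions`, the default port 7199, and the absence of JMX authentication are all assumptions to check against the committed patch (3316-v1.txt) and the node's configuration. The class name `RepairSessionTerminator` is hypothetical.

```java
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Hypothetical helper class; not part of Cassandra.
final class RepairSessionTerminator {
    // ASSUMPTION: MBean and operation names are not given in the message above.
    static final String MBEAN_NAME = "org.apache.cassandra.db:type=StorageService";
    static final String OPERATION = "forceTerminateAllRepairSessions";

    // Builds the standard RMI-based JMX service URL for a host and port.
    static String serviceUrl(String host, int port) {
        return "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi";
    }

    // Invokes the (assumed) no-argument operation on the given connection.
    static void terminate(MBeanServerConnection connection) throws Exception {
        connection.invoke(new ObjectName(MBEAN_NAME), OPERATION,
                          new Object[0], new String[0]);
    }

    public static void main(String[] args) throws Exception {
        String host = args.length > 0 ? args[0] : "localhost";
        // ASSUMPTION: 7199 is the node's JMX port and no credentials are required.
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 7199;

        JMXServiceURL url = new JMXServiceURL(serviceUrl(host, port));
        // JMXConnector is Closeable, so the connection is released on exit.
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            terminate(connector.getMBeanServerConnection());
        }
    }
}
```

In keeping with the comment above, this is a last-resort step for a node whose 'AntiEntropy Session' executor is clogged with hung sessions, not something to run routinely.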