[jira] [Commented] (CASSANDRA-16446) Parent repair sessions leak may lead to node long pauses

Berenguer Blasi (Jira) Tue, 30 Nov 2021 21:48:09 -0800


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-16446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17451511#comment-17451511
 ]


Berenguer Blasi commented on CASSANDRA-16446:
---------------------------------------------

[~dcapwell] I don't remember any specific reasons. Also, skimming the code, 
I don't see a reason why we couldn't clean up on failures as well. But this 
is not a part of the code I know by heart, so I guess the best thing is to 
give it a go and see what happens?

> Parent repair sessions leak may lead to node long pauses
> --------------------------------------------------------
>
>                 Key: CASSANDRA-16446
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16446
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Repair
>            Reporter: Berenguer Blasi
>            Assignee: Berenguer Blasi
>            Priority: Normal
>             Fix For: 4.0-rc1, 4.0
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> {{ActiveRepairService}} keeps a map, {{parentRepairSessions}}. If these 
> sessions leak, that map can grow to a size where, when a node restarts, 
> {{ActiveRepairService.onRestart()}} triggers a cleanup of sessions that can 
> pause nodes in the cluster for a long time.
> The proposed solution is for repairs to clean up these sessions on all nodes 
> on completion by sending a CLEANUP message to the involved nodes. Tests rely 
> on a new {{parentRepairSessionsCount()}} method on the parent repair sessions 
> MBean to track how many sessions remain.
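
The cleanup flow described in the issue can be sketched roughly as follows. This is a simplified, hypothetical model and not Cassandra's actual {{ActiveRepairService}}: the class {{ParentSessionRegistry}} and the methods {{register}}, {{cleanUp}} and {{runRepair}} are invented for illustration, and only the names {{parentRepairSessions}} and {{parentRepairSessionsCount()}} come from the ticket. The {{finally}} block reflects the "clean up on failures as well" idea raised in the comment, which is a suggestion under discussion and not confirmed behaviour.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified model of a per-node parent repair session registry.
// It is NOT Cassandra's ActiveRepairService; only the names parentRepairSessions
// and parentRepairSessionsCount() are taken from the ticket text.
class ParentSessionRegistry {
    // Session id -> placeholder description of what is being repaired.
    private final Map<UUID, String> parentRepairSessions = new ConcurrentHashMap<>();

    // Called when a node learns about a new parent repair session.
    void register(UUID sessionId, String description) {
        parentRepairSessions.put(sessionId, description);
    }

    // Stands in for handling a CLEANUP message: drop the session if present.
    // Returns true if an entry was actually removed.
    boolean cleanUp(UUID sessionId) {
        return parentRepairSessions.remove(sessionId) != null;
    }

    // Count that a test could poll to detect leaked sessions.
    int parentRepairSessionsCount() {
        return parentRepairSessions.size();
    }

    // Hypothetical coordinator-side helper: register the session on every
    // participant, run the repair, then clean up on every participant,
    // whether the repair succeeded or threw.
    static void runRepair(UUID sessionId,
                          Iterable<ParentSessionRegistry> participants,
                          Runnable repair) {
        for (ParentSessionRegistry node : participants) {
            node.register(sessionId, "example repair");
        }
        try {
            repair.run();
        } finally {
            for (ParentSessionRegistry node : participants) {
                node.cleanUp(sessionId);
            }
        }
    }
}
```

In this model, a test would assert that {{parentRepairSessionsCount()}} returns to zero on every participant once {{runRepair}} returns or throws; a non-zero count would indicate a leaked session.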



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
