[jira] [Commented] (CASSANDRA-16446) Parent repair sessions leak may lead to node long pauses
[ https://issues.apache.org/jira/browse/CASSANDRA-16446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17451934#comment-17451934 ]
David Capwell commented on CASSANDRA-16446:
Thanks, saw it while refactoring and wasn't sure if there was a good reason. What I see now is that FINALIZE_COMMIT and CLEANUP touch different maps, so I don't see a clear conflict with IR, and it makes sense to me to clean up on failure.
> Parent repair sessions leak may lead to node long pauses
>
> Key: CASSANDRA-16446
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16446
> Project: Cassandra
> Issue Type: Bug
> Components: Consistency/Repair
> Reporter: Berenguer Blasi
> Assignee: Berenguer Blasi
> Priority: Normal
> Fix For: 4.0-rc1, 4.0
> Time Spent: 1h
> Remaining Estimate: 0h
>
> {{ActiveRepairService}} keeps a map `parentRepairSessions`. If these sessions leak, that map can grow so large that, when a node restarts, {{ActiveRepairService.onRestart()}} triggers a cleanup of sessions that can pause nodes in a cluster for a long time.
> The proposed solution is for repairs to clean up these sessions on all nodes on completion by sending a CLEANUP message to the involved nodes. Tests rely on a new {{parentRepairSessionsCount()}} method on the parent repair sessions MBean to keep track of these.
--
This message was sent by Atlassian Jira (v8.20.1#820001)
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
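The leak described in the issue, and the question of whether cleanup should also run on failure, can be illustrated with a small sketch. This is not the Cassandra implementation: `ParentSessionRegistry`, `register`, `cleanup`, `runRepair` and `RepairCleanupSketch` are hypothetical names invented for illustration. Only the map name `parentRepairSessions` and the counter name `parentRepairSessionsCount()` are borrowed from the issue text. The sketch assumes a single in-process map and uses a `finally` block as a stand-in for sending a CLEANUP message on both the success and the failure path.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for a node's bookkeeping of parent repair sessions. */
class ParentSessionRegistry {
    // Session id -> participating nodes; plays the role the issue gives to parentRepairSessions.
    private final Map<UUID, Set<String>> parentRepairSessions = new ConcurrentHashMap<>();

    /** Registers a new parent session and returns its id. */
    UUID register(Set<String> participants) {
        UUID id = UUID.randomUUID();
        parentRepairSessions.put(id, new HashSet<>(participants));
        return id;
    }

    /** Stand-in for handling a CLEANUP message: drops the session if it is still present. */
    void cleanup(UUID id) {
        parentRepairSessions.remove(id);
    }

    /** Number of sessions still tracked; a growing value indicates a leak. */
    int parentRepairSessionsCount() {
        return parentRepairSessions.size();
    }

    /** Runs a repair body and cleans up whether it completes normally or throws. */
    void runRepair(Set<String> participants, Runnable repairBody) {
        UUID id = register(participants);
        try {
            repairBody.run();
        } finally {
            cleanup(id); // runs on success and on failure
        }
    }
}

class RepairCleanupSketch {
    public static void main(String[] args) {
        ParentSessionRegistry registry = new ParentSessionRegistry();
        Set<String> nodes = new HashSet<>(Arrays.asList("node1", "node2"));

        // Successful repair: the session is removed on completion.
        registry.runRepair(nodes, () -> { });

        // Failed repair: the session is still removed, so nothing leaks.
        try {
            registry.runRepair(nodes, () -> {
                throw new IllegalStateException("simulated repair failure");
            });
        } catch (IllegalStateException expected) {
            // The failure still propagates to the caller after cleanup.
        }

        System.out.println(registry.parentRepairSessionsCount()); // prints 0
    }
}
```

If `cleanup` were called only after a normal return, the failed run above would leave one entry behind, which is the kind of growth a count such as `parentRepairSessionsCount()` lets a test detect.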
[jira] [Commented] (CASSANDRA-16446) Parent repair sessions leak may lead to node long pauses
[ https://issues.apache.org/jira/browse/CASSANDRA-16446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17451511#comment-17451511 ]
Berenguer Blasi commented on CASSANDRA-16446:
[~dcapwell] I don't remember any specific reasons. Also reading the code diagonally I don't see a reason why we couldn't cleanup also on failures. But this is not a part of the code I know by heart so I guess the best is to give it a go and see what happens?
[jira] [Commented] (CASSANDRA-16446) Parent repair sessions leak may lead to node long pauses
[ https://issues.apache.org/jira/browse/CASSANDRA-16446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17451288#comment-17451288 ]
David Capwell commented on CASSANDRA-16446:
I don't see conversation in JIRA or GH, was wondering why cleanup is success only and does not include failure? Best I see is https://github.com/apache/cassandra/pull/896#discussion_r577334272
bq. Parent session is removed as part of the success() call path. If a session is failed, we can't recover or really act on it...
Was the concern IR?
[jira] [Commented] (CASSANDRA-16446) Parent repair sessions leak may lead to node long pauses
[ https://issues.apache.org/jira/browse/CASSANDRA-16446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17290730#comment-17290730 ]
Jaroslaw Grabowski commented on CASSANDRA-16446:
Thank you [~Bereng]!
[jira] [Commented] (CASSANDRA-16446) Parent repair sessions leak may lead to node long pauses
[ https://issues.apache.org/jira/browse/CASSANDRA-16446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17290703#comment-17290703 ]
Berenguer Blasi commented on CASSANDRA-16446:
Thx for all the work :-)
[jira] [Commented] (CASSANDRA-16446) Parent repair sessions leak may lead to node long pauses
[ https://issues.apache.org/jira/browse/CASSANDRA-16446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17290195#comment-17290195 ]
Andres de la Peña commented on CASSANDRA-16446:
Committed to {{trunk}} as [23512cf3da5e8206d8797841f2238cdd86c13d96|https://github.com/apache/cassandra/commit/23512cf3da5e8206d8797841f2238cdd86c13d96]. Dtests committed as [c89dea0e8c38ed35ed40d59c975a07585584a637|https://github.com/apache/cassandra-dtest/commit/c89dea0e8c38ed35ed40d59c975a07585584a637].
[jira] [Commented] (CASSANDRA-16446) Parent repair sessions leak may lead to node long pauses
[ https://issues.apache.org/jira/browse/CASSANDRA-16446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17289937#comment-17289937 ]
Berenguer Blasi commented on CASSANDRA-16446:
Ah yes I didn't rebase. I just added the @jira_ticket & the reformatted line. Apologies I didn't get it you wanted it rebased.
[jira] [Commented] (CASSANDRA-16446) Parent repair sessions leak may lead to node long pauses
[ https://issues.apache.org/jira/browse/CASSANDRA-16446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17289869#comment-17289869 ]
Andres de la Peña commented on CASSANDRA-16446:
I see that the PRs are not rebased, and CircleCI is pointing to a different dtest branch that seems identical except for the last {{@jira_ticket}} tags. I have rebased the branches (on my repo, [here|https://github.com/adelapena/cassandra/tree/CASSANDRA-16446-review] and [here|https://github.com/adelapena/cassandra-dtest/tree/CASSANDRA-16446-review]) and I'm running that final CI round:
* [circle j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/193/workflows/15564af1-1247-4b27-99f8-bb04c38b3ae6]
* [circle j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/193/workflows/75dbbd96-ade0-4ef0-82d1-96aab0e68fe4]
* [jenkins|https://ci-cassandra.apache.org/blue/organizations/jenkins/Cassandra-devbranch/detail/Cassandra-devbranch/399/pipeline]
[jira] [Commented] (CASSANDRA-16446) Parent repair sessions leak may lead to node long pauses
[ https://issues.apache.org/jira/browse/CASSANDRA-16446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17289861#comment-17289861 ]
Andres de la Peña commented on CASSANDRA-16446:
Great, I'm running a final CI round [here|https://ci-cassandra.apache.org/blue/organizations/jenkins/Cassandra-devbranch/detail/Cassandra-devbranch/398/pipeline] after the rebase just in case, I can commit once it finishes.
[jira] [Commented] (CASSANDRA-16446) Parent repair sessions leak may lead to node long pauses
[ https://issues.apache.org/jira/browse/CASSANDRA-16446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17286857#comment-17286857 ]
Berenguer Blasi commented on CASSANDRA-16446:
I'd say it lgtm. There are some timeouts and then the 16411 failures waiting to be merged in.
[jira] [Commented] (CASSANDRA-16446) Parent repair sessions leak may lead to node long pauses
[ https://issues.apache.org/jira/browse/CASSANDRA-16446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17286486#comment-17286486 ]
Ekaterina Dimitrova commented on CASSANDRA-16446:
New Jenkins CI run submitted [here | https://jenkins-cm4.apache.org/job/Cassandra-devbranch/387/]
[jira] [Commented] (CASSANDRA-16446) Parent repair sessions leak may lead to node long pauses
[ https://issues.apache.org/jira/browse/CASSANDRA-16446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17286283#comment-17286283 ]
Berenguer Blasi commented on CASSANDRA-16446:
Ah it failed bc of the rename we did... A new jenkins run should be clean now.
[jira] [Commented] (CASSANDRA-16446) Parent repair sessions leak may lead to node long pauses
[ https://issues.apache.org/jira/browse/CASSANDRA-16446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17286146#comment-17286146 ]
Ekaterina Dimitrova commented on CASSANDRA-16446:
Jenkins run pushed [here| https://jenkins-cm4.apache.org/job/Cassandra-devbranch/385/]
[jira] [Commented] (CASSANDRA-16446) Parent repair sessions leak may lead to node long pauses
[ https://issues.apache.org/jira/browse/CASSANDRA-16446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17285597#comment-17285597 ]
Ekaterina Dimitrova commented on CASSANDRA-16446:
I did a first pass, left a few small comments/questions but in general it looks good. We need second reviewer. [~adelapena], you were revising the testing for repair if I recall correctly, do you think you will have time to check this one, too, please?
[jira] [Commented] (CASSANDRA-16446) Parent repair sessions leak may lead to node long pauses
[ https://issues.apache.org/jira/browse/CASSANDRA-16446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17285298#comment-17285298 ]
Ekaterina Dimitrova commented on CASSANDRA-16446:
No worries at all, just wanted to be sure you don't have in mind some new changes to add :) Thanks for confirming!
[jira] [Commented] (CASSANDRA-16446) Parent repair sessions leak may lead to node long pauses
[ https://issues.apache.org/jira/browse/CASSANDRA-16446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17285291#comment-17285291 ]
Berenguer Blasi commented on CASSANDRA-16446:
Mmmm yes it is. Weird I must have missed moving the status forward. Apologies.
[jira] [Commented] (CASSANDRA-16446) Parent repair sessions leak may lead to node long pauses
[ https://issues.apache.org/jira/browse/CASSANDRA-16446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17285257#comment-17285257 ]
Ekaterina Dimitrova commented on CASSANDRA-16446:
This is ready for review, right? As I see it still in status "work in progress"
[jira] [Commented] (CASSANDRA-16446) Parent repair sessions leak may lead to node long pauses
[ https://issues.apache.org/jira/browse/CASSANDRA-16446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17284611#comment-17284611 ]
Berenguer Blasi commented on CASSANDRA-16446:
All praise and glory to [~jtgrabowski] for the original solution :-)