[ https://issues.apache.org/jira/browse/CASSANDRA-17168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcus Eriksson updated CASSANDRA-17168:
----------------------------------------
     Bug Category: Parent values: Availability(12983)
                   Level 1 values: Unavailable(12994)
       Complexity: Normal
      Component/s: Consistency/Repair
    Discovered By: Adhoc Test
    Fix Version/s: 4.0.x
                   4.x
        Reviewers: David Capwell
         Severity: Normal
           Status: Open  (was: Triage Needed)

trunk:
https://github.com/apache/cassandra/pull/1340
https://app.circleci.com/pipelines/github/krummas/cassandra?branch=marcuse%2F17168-trunk

4.0:
https://github.com/apache/cassandra/pull/1341
https://app.circleci.com/pipelines/github/krummas/cassandra?branch=marcuse%2F17168

Note that the trunk version includes a change to the PREPARE message to include repair parallelism instead of setting a flag on ParentRepairSession.

> Don't block gossip when clearing snapshots for failing repairs
> --------------------------------------------------------------
>
>                 Key: CASSANDRA-17168
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17168
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Repair
>            Reporter: Marcus Eriksson
>            Assignee: Marcus Eriksson
>            Priority: Normal
>             Fix For: 4.0.x, 4.x
>
>
> We clear snapshots in the GossipTasks thread when a repair session fails due
> to a replica shutting down. If there are many tables/repair sessions ongoing
> this can take a long time. With enough tables being repaired at the same time
> even checking if the snapshots exist can take long enough to mark nodes down.
> We should clear snapshots in a separate thread and add a flag to tell us
> whether this repair session can have snapshots to avoid checking if the
> directory exists.
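
The approach described in the ticket (move snapshot clearing off the gossip thread, and skip it entirely when a session cannot have snapshots) can be illustrated with a minimal Java sketch. This is not the code from the linked pull requests: `SnapshotCleanupSketch`, `Session`, `mayHaveSnapshots`, `onReplicaFailure` and `clearSnapshot` are hypothetical names chosen for illustration, and the disk work is replaced by a placeholder.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Illustrative sketch only; all names here are assumptions, not Cassandra APIs.
class SnapshotCleanupSketch {
    // Hypothetical stand-in for a repair session carrying a "can have snapshots" flag.
    static final class Session {
        final String id;
        final boolean mayHaveSnapshots;

        Session(String id, boolean mayHaveSnapshots) {
            this.id = id;
            this.mayHaveSnapshots = mayHaveSnapshots;
        }
    }

    // Dedicated thread so slow disk work never runs on the caller's (gossip) thread.
    private final ExecutorService snapshotCleaner = Executors.newSingleThreadExecutor();

    // Records which sessions were cleaned; stands in for the real side effect.
    final List<String> cleared = new CopyOnWriteArrayList<>();

    // Called from the failure-handling path; returns without touching the disk.
    void onReplicaFailure(Session session) {
        if (!session.mayHaveSnapshots) {
            return; // flag says no snapshots are possible, so skip the directory check
        }
        snapshotCleaner.execute(() -> clearSnapshot(session));
    }

    // Placeholder for the (potentially slow) snapshot directory removal.
    private void clearSnapshot(Session session) {
        cleared.add(session.id);
    }

    // Stops accepting work and waits briefly for queued cleanups to finish.
    boolean shutdownAndWait() {
        snapshotCleaner.shutdown();
        try {
            return snapshotCleaner.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

In this sketch the caller only pays for a flag check and a task submission, so a large number of failing sessions queues up cleanup work instead of stalling the thread that detected the failure.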