[ https://issues.apache.org/jira/browse/CASSANDRA-20877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dmitry Konstantinov updated CASSANDRA-20877:
--------------------------------------------
    Complexity: Low Hanging Fruit
 Discovered By: User Report
      Severity: Normal
        Status: Open  (was: Triage Needed)

> FINALIZED incremental local repair sessions are not cleaned up in case of a range movement
> -------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-20877
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20877
>             Project: Apache Cassandra
>          Issue Type: Bug
>          Components: Consistency/Repair
>            Reporter: Dmitry Konstantinov
>            Assignee: Dmitry Konstantinov
>            Priority: Normal
>
> * The system.repairs table is local to each Cassandra node.
> * This table is cleaned up by a periodically running org.apache.cassandra.repair.consistent.LocalSessions#cleanup() job.
> * The job runs every cassandra.repair_cleanup_interval_seconds (default: 10 minutes).
> * The job should delete repair sessions in the FINALIZED state that are older than cassandra.repair_delete_timeout_seconds (default: 1 day).
> * Before a FINALIZED session is deleted, the org.apache.cassandra.repair.consistent.LocalSessions#isSuperseded check is executed to ensure that all ranges and tables covered by the session have since been re-repaired by a more recent session. If the session is not superseded, its deletion from the table is skipped and a log message is printed:
> {code:java}
> Skipping delete of FINALIZED LocalSession {repairSessionId} because it has not been superseded by a more recent session{code}
> * The isSuperseded logic allows a repair session record to be deleted only if all of the session's ranges are covered by some newer session on the node.
>
> When a new node is added, a set of ranges moves to it and the data for those ranges is no longer repaired on the old nodes, so isSuperseded always returns false for the last session executed before the node was added.
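The coverage rule described above can be illustrated with a minimal, self-contained sketch. This is not the actual LocalSessions code: the class SupersededSketch, the Range and Session types, and the coveredByNewer helper are hypothetical names, and the model is deliberately simplified (numeric non-wrapping ranges, and a range counts as covered only if a single range of a newer session contains it). It only shows why a session recorded before a range movement can never become superseded.

```java
import java.util.List;
import java.util.Set;

public class SupersededSketch {

    /** Simplified token range (left, right]; wrap-around ranges are ignored in this sketch. */
    static final class Range {
        final long left;
        final long right;

        Range(long left, long right) {
            this.left = left;
            this.right = right;
        }

        boolean contains(Range other) {
            return left <= other.left && other.right <= right;
        }
    }

    /** Hypothetical stand-in for one FINALIZED session record on a node. */
    static final class Session {
        final long repairedAt;
        final Set<String> tables;
        final List<Range> ranges;

        Session(long repairedAt, Set<String> tables, List<Range> ranges) {
            this.repairedAt = repairedAt;
            this.tables = tables;
            this.ranges = ranges;
        }
    }

    /** True only if every (table, range) pair of the session is covered by a newer local session. */
    static boolean isSuperseded(Session session, List<Session> localSessions) {
        for (String table : session.tables) {
            for (Range range : session.ranges) {
                if (!coveredByNewer(session, table, range, localSessions)) {
                    return false;
                }
            }
        }
        return true;
    }

    private static boolean coveredByNewer(Session session, String table, Range range,
                                          List<Session> localSessions) {
        for (Session other : localSessions) {
            // Only strictly newer sessions that repaired the same table can supersede.
            if (other.repairedAt <= session.repairedAt || !other.tables.contains(table)) {
                continue;
            }
            for (Range candidate : other.ranges) {
                if (candidate.contains(range)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Before a node joins, this node repairs (0, 100].
        Session beforeJoin = new Session(100, Set.of("ks.t1"), List.of(new Range(0, 100)));
        // After the join, the node owns and repairs only (0, 50].
        Session afterJoin = new Session(200, Set.of("ks.t1"), List.of(new Range(0, 50)));

        // (50, 100] is never repaired here again, so the old session stays non-superseded.
        System.out.println(isSuperseded(beforeJoin, List.of(beforeJoin, afterJoin))); // prints false
    }
}
```

In this model, any later session on the node is limited to the ranges the node still owns, so no number of subsequent repairs makes the check pass for the pre-movement session. The same happens if a table listed in the old session is dropped, because no newer session will ever include that table.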
> In a large cluster to which many nodes are added while incremental repair runs regularly, many non-removable old records accumulate in the system.repairs table. This may slow down the startup of Cassandra nodes, especially if a large number of tokens has historically been used on the cluster.
> A similar issue occurs when a table is removed: the logic considers the last session executed for the removed table to be non-superseded and keeps it forever.