[ 
https://issues.apache.org/jira/browse/CASSANDRA-20877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Konstantinov reassigned CASSANDRA-20877:
-----------------------------------------------

    Assignee: Dmitry Konstantinov

> FINALIZED incremental local repair sessions are not cleaned up in case of a 
> range movement 
> -------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-20877
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20877
>             Project: Apache Cassandra
>          Issue Type: Bug
>          Components: Consistency/Repair
>            Reporter: Dmitry Konstantinov
>            Assignee: Dmitry Konstantinov
>            Priority: Normal
>
> * The system.repairs table is local to each Cassandra node.
>  * This table is cleaned up by the periodically running 
> org.apache.cassandra.repair.consistent.LocalSessions#cleanup() job.
>  * The job runs every cassandra.repair_cleanup_interval_seconds (default: 
> 10 minutes).
>  * The job should delete repair sessions in the FINALIZED state that are 
> older than cassandra.repair_delete_timeout_seconds (default: 1 day).
>  * Before a FINALIZED session is deleted, the 
> org.apache.cassandra.repair.consistent.LocalSessions#isSuperseded check is 
> executed to ensure that all ranges and tables covered by this session 
> have since been re-repaired by a more recent session. If the session is not 
> superseded, its deletion from the table is skipped and a log message is 
> printed:
> {code:java}
> Skipping delete of FINALIZED LocalSession {repairSessionId} because it has 
> not been superseded by a more recent session{code}
>  * The isSuperseded logic allows a repair session's info to be deleted only 
> if all of the session's ranges are covered by some newer session on the node.
> When a new node is added, a set of ranges is moved to it, and the data for 
> these ranges is no longer repaired on the old nodes, so isSuperseded always 
> returns false for the last session executed before the node was added.
> In a big cluster where many nodes are added while incremental repair runs 
> regularly, this leaves a lot of non-removable old records in the 
> system.repairs table. This may slow down startup of Cassandra nodes, 
> especially if a large number of tokens has historically been used on the 
> cluster.
> A similar issue occurs with table removal: the logic considers the last 
> session executed for a removed table as non-superseded and keeps it forever.
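
The coverage rule described above can be illustrated with a minimal sketch. This is not the actual LocalSessions#isSuperseded implementation: the class SupersededSketch, its Range and Session types, the long-valued tokens without wraparound, and the table name ks.t1 are all assumptions made for illustration. It only shows why a session that covered a range which later moved to another node can never be reported as superseded.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SupersededSketch {
    /** Token range (left, right]; simplified to long tokens without wraparound. */
    static final class Range {
        final long left;
        final long right;

        Range(long left, long right) {
            this.left = left;
            this.right = right;
        }

        boolean contains(Range other) {
            return left <= other.left && other.right <= right;
        }
    }

    /** Hypothetical stand-in for one FINALIZED row of system.repairs. */
    static final class Session {
        final long repairedAt;
        final Set<String> tables;
        final List<Range> ranges;

        Session(long repairedAt, Set<String> tables, List<Range> ranges) {
            this.repairedAt = repairedAt;
            this.tables = tables;
            this.ranges = ranges;
        }
    }

    /** True only if every (table, range) pair of the candidate is covered by a newer local session. */
    static boolean isSuperseded(Session candidate, List<Session> local) {
        for (String table : candidate.tables) {
            for (Range range : candidate.ranges) {
                if (!coveredByNewer(candidate, table, range, local)) {
                    return false;
                }
            }
        }
        return true;
    }

    private static boolean coveredByNewer(Session candidate, String table, Range range, List<Session> local) {
        for (Session other : local) {
            // Only strictly newer sessions that repaired the same table can supersede.
            if (other.repairedAt <= candidate.repairedAt || !other.tables.contains(table)) {
                continue;
            }
            for (Range newer : other.ranges) {
                if (newer.contains(range)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> tables = new HashSet<>(Arrays.asList("ks.t1"));
        // Last session before a node was added: this node still owned (0, 100].
        Session beforeMove = new Session(100, tables,
                Arrays.asList(new Range(0, 50), new Range(50, 100)));
        // After (50, 100] moved away, later repairs on this node only cover (0, 50].
        Session afterMove = new Session(200, tables, Arrays.asList(new Range(0, 50)));

        // (50, 100] is never re-repaired locally, so the old session is kept forever.
        System.out.println(isSuperseded(beforeMove, Arrays.asList(beforeMove, afterMove))); // false
    }
}
```

In this model the table-removal case behaves the same way: once no newer session lists a dropped table, the last session that repaired it fails the coverage check on every cleanup run.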



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
