Differentiate manual repair sessions from automatic
---------------------------------------------------

                 Key: CASSANDRA-1190
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1190
             Project: Cassandra
          Issue Type: Bug
            Reporter: Stu Hood
            Priority: Critical
             Fix For: 0.6.3, 0.7


Currently, both manual and automatic repair sessions use the same timeout 
value, TREE_STORE_TIMEOUT. As a result, TREE_STORE_TIMEOUT caps how long 
compaction can take: if it runs longer, a manual repair fails.

For automatic/natural repairs (triggered by two nodes autonomously finishing 
major compactions around the same time), you want a relatively low 
TREE_STORE_TIMEOUT value, because trees generated a long time apart will cause 
a lot of unnecessary repair. The current value is 10 minutes, to optimize for 
this case.

On the other hand, for manual repairs, TREE_STORE_TIMEOUT needs to be 
significantly higher. For instance, if a manual repair is triggered for a 
source node A storing 2 TB of data and a destination node B with an empty 
store, then node B needs to wait long enough for node A to finish compacting 
2 TB of data, which might take more than 12 hours. If node B times out its 
local tree before node A sends its tree, then the repair will not occur.
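
A rough sketch of the distinction being asked for, not a patch: the class, 
enum, and method names below are hypothetical and do not come from the 
Cassandra source, and the 24-hour manual timeout is an assumed placeholder 
(the issue only says it must be significantly higher). The 10-minute value is 
the current TREE_STORE_TIMEOUT described above.

```java
import java.time.Duration;

/** Hypothetical sketch: pick a tree expiry timeout by how the repair session started. */
final class TreeTimeoutSketch {
    /** How the repair session was started (names are illustrative only). */
    enum SessionKind { AUTOMATIC, MANUAL }

    // Current TREE_STORE_TIMEOUT according to the issue text: 10 minutes.
    static final Duration AUTOMATIC_TIMEOUT = Duration.ofMinutes(10);

    // Assumed value for illustration; the issue does not propose a number.
    static final Duration MANUAL_TIMEOUT = Duration.ofHours(24);

    /** Returns the timeout to apply to a stored tree for the given session kind. */
    static Duration timeoutFor(SessionKind kind) {
        return kind == SessionKind.MANUAL ? MANUAL_TIMEOUT : AUTOMATIC_TIMEOUT;
    }

    /** True if the locally stored tree is older than the timeout for this session kind. */
    static boolean isExpired(SessionKind kind, Duration ageOfLocalTree) {
        return ageOfLocalTree.compareTo(timeoutFor(kind)) > 0;
    }
}
```

With these assumed values, a local tree that has waited 12 hours for node A 
would be expired under the automatic timeout but still usable in a manual 
session.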

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
