[ 
https://issues.apache.org/jira/browse/CASSANDRA-10446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107876#comment-15107876
 ] 

Anuj Wadehra commented on CASSANDRA-10446:
------------------------------------------

I think this is an issue with the way we handle the "downed replica" scenario 
in repairs. We should increase the priority and change the type from 
Improvement to Bug.

Consider the following scenario and flow of events, which demonstrate the 
importance of this issue:
Scenario: I have a 20-node cluster, RF=5, Read/Write Quorum, gc grace 
period=20 days. My cluster is fault tolerant and can afford 2 node failures.

Suddenly, one node goes down due to a hardware issue. The failed node prevents 
repair on many nodes in the cluster, as it holds approximately 5/20 of the 
total data: 1/20 which it owns and 4/20 which it stores as a replica of data 
owned by other nodes. Now it has been 10 days since the node went down, most 
of the nodes are not being repaired, and it is decision time. I am not sure 
how soon the issue will be fixed, maybe in the next 2 days, i.e. 8 days before 
gc grace expires, so I shouldn't remove the node early and add it back, as 
that would cause significant and unnecessary streaming due to token 
re-arrangement. At the same time, if I don't remove the failed node now, i.e. 
at 10 days (well before gc grace), my entire system health would be in 
question and it would be a panic situation, as most of the data has not been 
repaired in the last 10 days and gc grace is approaching. I need sufficient 
time to repair all nodes.
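
The numbers in this scenario can be checked with a small sketch. This is 
illustrative arithmetic only, not Cassandra code. The function name and keys 
are made up for this example, and the count of affected nodes assumes a 
single-token ring with replicas placed on consecutive nodes; with vnodes the 
failed node would typically share ranges with even more of the cluster.

```python
def scenario(nodes=20, rf=5, gc_grace_days=20, days_down=10, fix_eta_days=2):
    """Hypothetical helper: arithmetic for the scenario described above."""
    quorum = rf // 2 + 1  # replicas that must respond for QUORUM
    return {
        "quorum": quorum,
        # replicas that can be down while QUORUM still succeeds
        "tolerated_failures": rf - quorum,
        # fraction of all data with a copy on any single node (5/20)
        "data_share_on_down_node": rf / nodes,
        # ASSUMPTION: single-token ring, consecutive replica placement.
        # The down node then shares at least one range with rf-1 nodes
        # on each side of it.
        "nodes_sharing_a_range": min(2 * (rf - 1), nodes - 1),
        # days left before gc grace expires if the node returns on the ETA
        "days_left_if_fixed_on_eta": gc_grace_days - days_down - fix_eta_days,
    }


if __name__ == "__main__":
    for key, value in scenario().items():
        print(f"{key}: {value}")
```

Under these assumptions, one down node leaves 8 other nodes unable to fully 
repair at least one of their ranges, even though reads and writes at Quorum 
are unaffected.
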
What looked like a fault-tolerant Cassandra cluster that could easily afford 2 
node failures required urgent attention and manual decision making when a 
single node went down. If some replicas are down, we should allow repair to 
proceed with the remaining replicas. If the failed nodes come up before the gc 
grace period expires, we would run repair to fix inconsistencies; otherwise we 
would discard their data and bootstrap. I think that would be a really robust, 
fault-tolerant system.
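
To make the proposed behaviour concrete, here is a minimal sketch of how the 
participants for one range could be chosen. It is not Cassandra's 
implementation: `repair_participants` and its `force` parameter are 
hypothetical names used only for illustration, loosely modelled on the -force 
option suggested in the issue description.

```python
def repair_participants(replicas, live, force=False):
    """Hypothetical sketch: pick the replicas that repair one token range.

    replicas: all replica endpoints for the range
    live:     endpoints currently known to be up
    force:    if True, proceed with the live replicas only
    """
    down = [r for r in replicas if r not in live]
    if down and not force:
        # Behaviour described in this comment: one down replica blocks repair.
        raise RuntimeError(f"cannot repair range, replicas down: {down}")
    # Proposed behaviour: skip down replicas and repair the rest.
    return [r for r in replicas if r in live]


if __name__ == "__main__":
    replicas = ["n1", "n2", "n3", "n4", "n5"]  # RF=5
    live = {"n1", "n2", "n4", "n5"}            # n3 has failed
    print(repair_participants(replicas, live, force=True))
```

With `force=False` the call raises because n3 is down; with `force=True` the 
four live replicas are still repaired against each other.
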



> Run repair with down replicas
> -----------------------------
>
>                 Key: CASSANDRA-10446
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10446
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: sankalp kohli
>            Priority: Minor
>             Fix For: 3.x
>
>
> We should have an option of running repair when replicas are down. We can 
> call it -force.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
