Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "Operations" page has been changed by PeterSchuller.
The comment on this change is: Document how to deal with lack of repair within 
GCGraceSeconds.
http://wiki.apache.org/cassandra/Operations?action=diff&rev1=75&rev2=76

--------------------------------------------------

  === Frequency of nodetool repair ===
  
  Unless your application performs no deletes, it is vital that production 
clusters run `nodetool repair` periodically on all nodes in the cluster. The 
hard requirement on repair frequency is the value used for GCGraceSeconds (see 
[[DistributedDeletes]]): running `nodetool repair` often enough to guarantee 
that every node performs a repair within any given period of GCGraceSeconds 
ensures that deletes are not "forgotten" in the cluster.
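+ 
+ For example, a minimal sketch of one way to schedule this from cron (the 
hostname, schedule and log path are placeholders; any scheduler will do, as 
long as every node is repaired at least once per GCGraceSeconds, which 
defaults to 10 days):
+ 
+ {{{
+ # On each node, run a weekly repair, staggered so that not all nodes
+ # repair at the same time (here: Mondays at 03:00).
+ 0 3 * * 1 nodetool -h localhost repair >> /var/log/cassandra/repair.log 2>&1
+ }}}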
+ 
+ ==== Dealing with the consequences of nodetool repair not running within 
GCGraceSeconds ====
+ 
+ If `nodetool repair` has not been run often enough, to the point that more 
than GCGraceSeconds has passed since the last repair, you risk forgotten 
deletes (see [[DistributedDeletes]]). In addition to deleted data popping back 
up, you may see inconsistencies in the data returned by different nodes that 
will not self-heal by read repair or further `nodetool repair`. Further 
details on this latter effect are documented in 
[[https://issues.apache.org/jira/browse/CASSANDRA-1316|CASSANDRA-1316]].
+ 
+ There are at least three ways to deal with this scenario.
+ 
+  1. Treat the node in question as failed, and replace it as described 
further below.
+  1. To minimize the number of forgotten deletes, first increase 
GCGraceSeconds across the cluster (a rolling restart is required), perform a 
full repair on all nodes, and then change GCGraceSeconds back again. This has 
the advantage of ensuring that tombstones spread as much as possible, 
minimizing the amount of data that may "pop back up" (forgotten deletes).
+  1. Another option, which will result in more forgotten deletes than the 
previous suggestion but is easier to carry out, is to ensure `nodetool repair` 
has been run on all nodes and then perform a compaction to expire the 
tombstones. Following this, read repair and regular `nodetool repair` should 
cause the cluster to converge; see the sketch after this list.
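+ 
+ A rough sketch of the last option (the hostname is a placeholder, and the 
commands must be run against every node in the cluster; `nodetool compact` 
triggers a major compaction, which purges tombstones older than 
GCGraceSeconds):
+ 
+ {{{
+ # First make sure a repair has completed on every node:
+ nodetool -h node1.example.com repair
+ # ...repeat for each remaining node.
+ 
+ # Then compact each node to expire tombstones older than GCGraceSeconds:
+ nodetool -h node1.example.com compact
+ # ...repeat for each remaining node.
+ }}}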
  
  === Handling failure ===
  If a node goes down and comes back up, the ordinary repair mechanisms will be 
adequate to deal with any inconsistent data.  Remember though that if a node 
misses updates and is not repaired for longer than your configured 
GCGraceSeconds (default: 10 days), it could have missed remove operations 
permanently.  Unless your application performs no removes, you should wipe its 
data directory, re-bootstrap it, and use `nodetool removetoken` to remove its 
old entry from the ring (see below).
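+ 
+ As a rough sketch of that procedure (the hostname and token are 
placeholders, and the exact bootstrap setting depends on your version, e.g. 
AutoBootstrap in storage-conf.xml or auto_bootstrap in cassandra.yaml; the 
full procedure is described below):
+ 
+ {{{
+ # 1. Stop Cassandra on the affected node and wipe its data and commit log
+ #    directories.
+ # 2. Enable auto bootstrap in its configuration and start it again, so that
+ #    it re-bootstraps and streams data from its neighbours.
+ # 3. From any live node, look up the old token and remove it from the ring:
+ nodetool -h node1.example.com ring
+ nodetool -h node1.example.com removetoken <old-token>
+ }}}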
