On Mon, Apr 13, 2015 at 3:33 PM, Jeff Ferland <j...@tubularlabs.com> wrote:

> Nodetool repair -par: covers all nodes, computes merkle trees for each
> node at the same time. Much higher IO load as every copy of a key range is
> scanned at once. Can be totally OK with SSDs and throughput limits.  Only
> need to run the command on one node.
>

No? -par is just a performance (of repair) de-optimization, intended to
improve service time during repair. Doing -par without -pr on a single node
doesn't repair your entire cluster.

Consider the following 7-node cluster, without vnodes:

A B C D E F G
RF=3

You run a repair on node D, without -pr.

D is repaired against B's tertiary replicas.
D is repaired against C's secondary replicas.
E is repaired against D's secondary replicas.
F is repaired against D's tertiary replicas.
Nodes A and G are completely unaffected and unrepaired, because D does not
share any ranges with them.
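
To make that concrete, here is a rough Python sketch of the same ring. It
assumes one token per node and SimpleStrategy-style placement (each range lives
on its primary node plus the next RF-1 nodes clockwise). The function names are
made up for illustration and have nothing to do with Cassandra's own code.

```python
# Assumed model: single-token nodes in ring order; each range is stored on
# its primary owner and the next RF-1 nodes clockwise (SimpleStrategy-like).
RING = ["A", "B", "C", "D", "E", "F", "G"]
RF = 3

def replicas_of(primary, ring=RING, rf=RF):
    """Nodes holding the range whose primary owner is `primary`."""
    i = ring.index(primary)
    return [ring[(i + k) % len(ring)] for k in range(rf)]

def ranges_on(node, ring=RING, rf=RF):
    """Primary owners of every range that `node` stores a replica of."""
    return [p for p in ring if node in replicas_of(p, ring, rf)]

def nodes_touched_by_repair(node, ring=RING, rf=RF):
    """Nodes involved when `node` repairs all of its ranges (no -pr)."""
    touched = set()
    for p in ranges_on(node, ring, rf):
        touched.update(replicas_of(p, ring, rf))
    return touched

touched = nodes_touched_by_repair("D")
print(ranges_on("D"))                 # ranges D holds, named by primary owner
print(sorted(touched))                # nodes that take part in the repair
print(sorted(set(RING) - touched))    # nodes left unrepaired
```

Under those assumptions D holds the ranges owned by B, C and D, so the repair
involves B through F and never touches A or G.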

Repair, with or without -par, only covers the *replica* nodes for the ranges
held by the node you run it on. Even with vnodes, you still have to run it on
almost all nodes in most cases, which is why most users should save themselves
the complexity and just do a rolling -par -pr on all nodes, one by one.
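
As a sanity check on the rolling approach, the sketch below counts how many
times each range gets repaired when every node is visited once. It uses the same
assumed toy model (one token per node, SimpleStrategy-style placement), and
treats -pr as "repair only the range this node is primary for". The helper names
are hypothetical.

```python
from collections import Counter

# Assumed toy ring: one token per node, replicas on the next RF-1 nodes.
RING = ["A", "B", "C", "D", "E", "F", "G"]
RF = 3

def replicas_of(primary):
    """Nodes holding the range whose primary owner is `primary`."""
    i = RING.index(primary)
    return [RING[(i + k) % len(RING)] for k in range(RF)]

def repaired_ranges(node, primary_only):
    """Ranges (named by primary owner) repaired by one run on `node`."""
    if primary_only:  # models -pr: only this node's primary range
        return [node]
    # Without -pr: every range this node stores a replica of.
    return [p for p in RING if node in replicas_of(p)]

def rolling_repair(primary_only):
    """Count repairs per range when each node is visited exactly once."""
    counts = Counter()
    for node in RING:
        counts.update(repaired_ranges(node, primary_only))
    return counts

with_pr = rolling_repair(primary_only=True)
without_pr = rolling_repair(primary_only=False)
print(dict(with_pr))      # repairs per range with -pr on every node
print(dict(without_pr))   # repairs per range without -pr on every node
```

In this model, -pr on every node repairs each range exactly once, while running
without -pr on every node repairs each range RF times.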

=Rob
