[ 
https://issues.apache.org/jira/browse/CASSANDRA-8177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183058#comment-14183058
 ] 

Yuki Morishita commented on CASSANDRA-8177:
-------------------------------------------

Sequential repair first asks every replica to take a snapshot, and then asks 
each replica, one by one, to calculate a Merkle tree on its snapshot. Parallel 
repair asks all replicas to calculate their Merkle trees at the same time. 
There is no difference in the I/O operations performed.

Calculating the Merkle tree ('validation') and streaming SSTables are the two 
most time-consuming parts of repair.
Sequential repair runs validation one node at a time, so it takes longer than 
parallel repair.
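
To illustrate the point about validation time, here is a toy model in Python. 
It is only a sketch: the function name and the per-replica durations are made 
up for illustration and are not taken from Cassandra or from the attached 
graph. It assumes each replica does the same validation work in both modes, so 
only the wall-clock time changes.

```python
def validation_wall_time(per_replica_seconds, parallel):
    """Toy model: wall-clock time of the validation phase only.

    per_replica_seconds: hypothetical time each replica needs to build
    its Merkle tree. Streaming is deliberately left out of the model.
    """
    if not per_replica_seconds:
        return 0.0
    if parallel:
        # All replicas validate at once; the slowest one sets the duration.
        return max(per_replica_seconds)
    # Replicas validate one after another; the durations add up.
    return sum(per_replica_seconds)


# Made-up durations for a 3-replica cluster.
replica_times = [120.0, 100.0, 110.0]

sequential = validation_wall_time(replica_times, parallel=False)
parallel = validation_wall_time(replica_times, parallel=True)

# The total amount of validation work is identical in both modes.
total_work = sum(replica_times)

print(sequential, parallel, total_work)
```

In this model the sequential run takes the sum of the per-replica times and 
the parallel run takes the maximum, while the total work stays the same.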

Which line in the graph is the 'write' I/O? Is it possible that the nodes were 
streaming SSTables to each other during the sequential repair, and that there 
were far fewer out-of-sync SSTables during the parallel repair?
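
For context on why the amount of streaming depends on how far out of sync the 
replicas are, here is a minimal Merkle tree comparison in Python. This is an 
illustrative sketch, not Cassandra's implementation: the helper names, the use 
of SHA-256, and the hashing of keys into a fixed number of buckets (standing 
in for token ranges) are all assumptions.

```python
import hashlib


def leaf_hashes(rows, num_leaves):
    """Hash each row into one of num_leaves buckets (stand-in for token ranges)."""
    buckets = [hashlib.sha256() for _ in range(num_leaves)]
    for key, value in sorted(rows.items()):
        idx = int(hashlib.sha256(key.encode()).hexdigest(), 16) % num_leaves
        buckets[idx].update(f"{key}={value};".encode())
    return [b.digest() for b in buckets]


def build_tree(leaves):
    """Return the tree as a list of levels, leaves first and root last.

    len(leaves) must be a power of two.
    """
    levels = [leaves]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append(
            [hashlib.sha256(prev[i] + prev[i + 1]).digest()
             for i in range(0, len(prev), 2)]
        )
    return levels


def differing_leaves(tree_a, tree_b):
    """Walk down from the root, descending only into subtrees that differ."""
    top = len(tree_a) - 1
    if tree_a[top][0] == tree_b[top][0]:
        return []  # Roots match: the replicas agree, nothing to stream.
    frontier = [0]
    for level in range(top, 0, -1):
        next_frontier = []
        for i in frontier:
            for child in (2 * i, 2 * i + 1):
                if tree_a[level - 1][child] != tree_b[level - 1][child]:
                    next_frontier.append(child)
        frontier = next_frontier
    return frontier


# Two hypothetical replicas that disagree on a single row.
replica_a = {"k1": "v1", "k2": "v2", "k3": "v3"}
replica_b = {"k1": "v1", "k2": "stale", "k3": "v3"}

tree_a = build_tree(leaf_hashes(replica_a, 4))
tree_b = build_tree(leaf_hashes(replica_b, 4))
out_of_sync = differing_leaves(tree_a, tree_b)
print(out_of_sync)
```

Only the buckets returned by `differing_leaves` would need to be streamed in 
this sketch, so replicas that are mostly in sync produce little streaming.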

> sequential repair is much more expensive than parallel repair
> -------------------------------------------------------------
>
>                 Key: CASSANDRA-8177
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8177
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Sean Bridges
>            Assignee: Yuki Morishita
>         Attachments: iostats.png
>
>
> This is with 2.0.10
> The attached graph shows io read/write throughput (as measured with iostat) 
> when doing repairs.
> The large hump on the left is a sequential repair of one node.  The two much 
> smaller peaks on the right are parallel repairs.
> This is a 3 node cluster using vnodes (I know vnodes on small clusters isn't 
> recommended).  Cassandra reports load of 40 gigs.
> We noticed a similar problem with a larger cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
