[ https://issues.apache.org/jira/browse/CASSANDRA-8177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183058#comment-14183058 ]
Yuki Morishita commented on CASSANDRA-8177:
-------------------------------------------

Sequential repair first asks every replica to take a snapshot, and then asks each replica, one at a time, to calculate a Merkle tree on its snapshot. Parallel repair asks all replicas to calculate their Merkle trees at the same time. There is no difference in the I/O operations performed.

Calculating the Merkle tree ('validation') and streaming SSTables are the two most time-consuming parts of repair. Sequential repair does validation one node at a time, so it takes longer than parallel repair.

Which line in the graph is 'write' I/O? Is it possible that the nodes are streaming SSTables to each other during the sequential repair, and that there are far fewer out-of-sync SSTables during the parallel repair?

> sequential repair is much more expensive than parallel repair
> -------------------------------------------------------------
>
>                 Key: CASSANDRA-8177
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8177
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Sean Bridges
>            Assignee: Yuki Morishita
>         Attachments: iostats.png
>
>
> This is with 2.0.10
> The attached graph shows io read/write throughput (as measured with iostat) when doing repairs.
> The large hump on the left is a sequential repair of one node. The two much smaller peaks on the right are parallel repairs.
> This is a 3 node cluster using vnodes (I know vnodes on small clusters isn't recommended). Cassandra reports load of 40 gigs.
> We noticed a similar problem with a larger cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)