[ 
https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261376#comment-14261376
 ] 

Jeremy Hanna commented on CASSANDRA-5220:
-----------------------------------------

I think it's important to reiterate that the project devs recognize that these 
inefficiencies are affecting many users, and that a lot of work on repair is 
being done in parallel.  As Yuki pointed out, incremental repair 
(CASSANDRA-5351) is already in 2.1 and improved concurrency for the repair 
process (CASSANDRA-6455) is coming in 3.0, so many of the problems seen in 
this ticket will be resolved.

Until 2.1/3.0, sub-range repair (CASSANDRA-5280) helps parallelize repair and 
make it more efficient with virtual nodes.  See 
http://www.datastax.com/dev/blog/advanced-repair-techniques for details about 
efficiency gains with sub-range repair.  The drawback is that sub-range 
repairs are more tedious to track.  Saving repair data to a system table 
(CASSANDRA-5839) will help track them in Cassandra itself.
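
As a rough illustration of what driving sub-range repair by hand can look 
like, here is a minimal sketch that splits one token range into equal slices 
and builds a nodetool command per slice.  It is not taken from the ticket or 
the blog post.  The function names and the "my_keyspace" keyspace are 
hypothetical, and it assumes integer tokens and a range that does not wrap 
around the ring.  The only thing it relies on from Cassandra is the -st/-et 
(start/end token) options of nodetool repair.

```python
def split_range(start, end, parts):
    """Split the token range (start, end] into up to `parts` contiguous slices.

    Assumption: integer tokens and start < end (no wrap-around handling).
    """
    if parts < 1:
        raise ValueError("parts must be >= 1")
    if start >= end:
        raise ValueError("this sketch does not handle wrap-around ranges")
    width = end - start
    # Integer arithmetic keeps the boundaries exact for 64-bit tokens.
    bounds = [start + (width * i) // parts for i in range(parts + 1)]
    # Skip empty slices, which appear when parts exceeds the range width.
    return [(lo, hi) for lo, hi in zip(bounds, bounds[1:]) if lo < hi]


def repair_commands(keyspace, start, end, parts):
    """Build one `nodetool repair -st ... -et ...` command per slice."""
    return [
        "nodetool repair -st {} -et {} {}".format(lo, hi, keyspace)
        for lo, hi in split_range(start, end, parts)
    ]


if __name__ == "__main__":
    # Hypothetical keyspace and token range, for illustration only.
    for cmd in repair_commands("my_keyspace", 0, 1000, 4):
        print(cmd)
```

Each generated command is one unit of work that has to be run and recorded 
somewhere, which is the tracking burden mentioned above.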

> Repair improvements when using vnodes
> -------------------------------------
>
>                 Key: CASSANDRA-5220
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5220
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 1.2.0 beta 1
>            Reporter: Brandon Williams
>            Assignee: Yuki Morishita
>              Labels: performance, repair
>         Attachments: 5220-yourkit.png, 5220-yourkit.tar.bz2
>
>
> Currently when using vnodes, repair takes much longer to complete than 
> without them.  This appears to be at least in part because it's using a 
> session per range and processing them sequentially.  This generates a lot of 
> log spam with vnodes, and while it is gentler and lighter on hard disk 
> deployments, SSD-based deployments would often prefer that repair be as fast 
> as possible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
