Improve the precision of the repair merkle trees
------------------------------------------------

                 Key: CASSANDRA-2541
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2541
             Project: Cassandra
          Issue Type: Improvement
          Components: Core
    Affects Versions: 0.6
            Reporter: Sylvain Lebresne
            Assignee: Sylvain Lebresne
            Priority: Minor
             Fix For: 0.8.1

Repair uses the sstable sampled keys to split the merkle tree. This means the 'precision' of the tree will be index_interval (so 128 by default). This is probably fine when you have lots of skinny rows. But when you have fewer, fatter rows, this is probably unnecessarily imprecise. Added to that, each node will not have the same set of samples, so you may not always end up using the most precise range in the trees when computing differences, which could make the imprecision worse (to be fair, it is quite possible this happens very rarely).

Anyway, this ticket proposes to add an additional 'split_factor' (it can be fixed, or it can be configurable, either by the user or based on metrics on how fat the rows are) that makes us re-split each range 'split_factor' times after the initial sample-based split of the tree.
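
To make the proposal concrete, here is a minimal sketch of the re-split step. It is an illustration only, not Cassandra's code. It assumes tokens are plain integers and that "re-split 'split_factor' times" means halving every range at its midpoint on each pass, so one sampled range ends up as 2**split_factor sub-ranges. The function name `resplit` and the midpoint rule are hypothetical; only `split_factor` comes from the ticket.

```python
def resplit(boundaries, split_factor):
    """Refine sample-based range boundaries by repeated midpoint halving.

    `boundaries` are the tokens from the initial sample-based split
    (assumed to be integers here). Each pass cuts every range in two,
    so `split_factor` passes yield up to 2**split_factor sub-ranges
    per original range.
    """
    points = sorted(set(boundaries))
    if len(points) < 2:
        return points  # no range to split
    for _ in range(split_factor):
        refined = [points[0]]
        for left, right in zip(points, points[1:]):
            mid = left + (right - left) // 2
            if mid != left:  # skip ranges too narrow to split further
                refined.append(mid)
            refined.append(right)
        points = refined
    return points


# Two sampled ranges, each re-split once.
print(resplit([0, 128, 256], 1))  # [0, 64, 128, 192, 256]
```

Because the refinement depends only on the input boundaries and split_factor, two nodes that start from the same samples would derive the same sub-ranges. It does not by itself address nodes that start from different samples.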