[jira] [Resolved] (CASSANDRA-2541) Improve the precision of the repair merkle trees

Sylvain Lebresne (JIRA) Wed, 04 May 2011 08:48:46 -0700

     [ 
https://issues.apache.org/jira/browse/CASSANDRA-2541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Sylvain Lebresne resolved CASSANDRA-2541.
-----------------------------------------

    Resolution: Invalid

Well, I was actually wrong, in that the splitting reuses the sample as long as it 
doesn't have a complete tree (complete in the sense of depth or size greater 
than their fixed limits). So I think there is no particular problem here.

There is a small bug in 0.8 code that can make the splitting process exit 
early, but I'll open another ticket for that.

> Improve the precision of the repair merkle trees
> ------------------------------------------------
>
>                 Key: CASSANDRA-2541
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2541
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.6
>            Reporter: Sylvain Lebresne
>            Assignee: Sylvain Lebresne
>            Priority: Minor
>              Labels: repair
>             Fix For: 0.8.1
>
>   Original Estimate: 8h
>  Remaining Estimate: 8h
>
> Repair uses the sstable sampled keys to split the merkle tree. This means the 
> 'precision' of the tree will be index_interval (so 128 by default). This is 
> probably fine when you have lots of skinny rows. But when you have fewer, 
> fatter rows, this is probably unnecessarily imprecise.
> Added to that, since each node will not have the same set of samples, 
> you may not always end up using the more precise range in the trees when 
> computing differences, which could make the imprecision worse (to be fair, it 
> is quite possible this happens very rarely).
> Anyway, this ticket proposes to add an additional 'split_factor' (can be 
> fixed, can be configurable (by the user or based on metrics on how fat the 
> rows are)) that makes us re-split each range 'split_factor' times after the 
> initial sample-based split of the tree.
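
A minimal sketch of the proposal quoted above, not Cassandra's actual code. It
assumes integer tokens, half-open [left, right) ranges, and that one re-split
pass halves every range at its midpoint; the function names are hypothetical.

```python
def sample_boundaries(sorted_tokens, index_interval=128):
    """Keep every index_interval-th token, mimicking an sstable index sample."""
    return sorted_tokens[::index_interval]


def ranges_from_boundaries(boundaries, lo, hi):
    """Turn sorted boundaries into contiguous [left, right) ranges over [lo, hi)."""
    points = [lo] + [b for b in boundaries if lo < b < hi] + [hi]
    return list(zip(points, points[1:]))


def resplit(ranges, split_factor):
    """Halve every range at its midpoint, split_factor times (assumed semantics)."""
    for _ in range(split_factor):
        halved = []
        for left, right in ranges:
            mid = (left + right) // 2
            if mid == left:
                # Range is too narrow to split any further; keep it as is.
                halved.append((left, right))
            else:
                halved.extend([(left, mid), (mid, right)])
        ranges = halved
    return ranges


# 1024 rows with evenly spaced tokens 0..1023 (illustrative data only).
tokens = list(range(1024))
coarse = ranges_from_boundaries(sample_boundaries(tokens), 0, 1024)
fine = resplit(coarse, split_factor=2)
```

With these assumptions, the sample-based split gives ranges covering 128 rows
each, and two re-split passes narrow each range to 32 rows.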

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
