[ 
https://issues.apache.org/jira/browse/CASSANDRA-5263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889277#comment-13889277
 ] 

Minh Do commented on CASSANDRA-5263:
------------------------------------

Using some generated SSTable files within a token range, I ran a test of 
building the Merkle tree at depth 20 and then adding the computed hash values 
for the rows (69M rows added).  These two steps together are equivalent to a 
validation compaction on a token range, if I am not missing anything.

1. Tree building uses, on average, 15-18% of total CPU resources, and no I/O
2. SSTable scanning and row hash computation use, on average, 10-12% of total 
   CPU resources, with I/O throttled by the configurable global compaction 
   rate limiter


Given Jonathan's pointer to using SSTR.estimatedKeysForRanges() to estimate 
the number of rows in an SSTable file, and assuming no overlap among SSTable 
files (the worst case), we can estimate how many data rows fall in a given 
token range.
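The worst-case estimate above amounts to summing the per-SSTable estimates. A hedged sketch of that (KeyEstimator and RangeRowEstimator are made-up stand-ins for illustration; the real method is SSTableReader.estimatedKeysForRanges, which takes a collection of token ranges):

```java
import java.util.List;

public final class RangeRowEstimator
{
    /**
     * Simplified stand-in for the per-SSTable estimate that
     * SSTableReader.estimatedKeysForRanges() would provide for the
     * token range being repaired.
     */
    public interface KeyEstimator
    {
        long estimatedKeysForRange();
    }

    /**
     * Worst-case row count for the range: assume no overlap among
     * SSTables and simply sum the per-SSTable key estimates.
     */
    public static long estimateRows(List<KeyEstimator> sstables)
    {
        long total = 0;
        for (KeyEstimator sstable : sstables)
            total += sstable.estimatedKeysForRange();
        return total;
    }
}
```

Since overlapping SSTables would make this an overestimate, the derived tree depth errs toward deeper (finer-grained) trees, which only costs memory, not correctness.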

From what I understand, here is the formula to calculate the Merkle tree's 
depth (assuming each data row has a unique hash value):

1. If the number of rows from all SSTables in a given range is approximately 
equal to the maximum number of hash entries in that range (which depends on 
the CF's partitioner), then we build the tree at depth 20 (the densest case)
2. When the rows from all SSTables in a given range do not cover the full 
hash range (the sparse case), we build a Merkle tree with a depth less than 
20. How do we come up with the right depth?  
     depth = 20 * (n rows / max rows) 
where n is the total number of rows in all SSTables and max is the maximum 
number of hash entries in that token range.
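The proposed formula could be sketched as follows (an illustration only, with made-up names, not Cassandra code; it clamps the result to [1, 20] so a very sparse range still gets a usable tree):

```java
public final class MerkleDepthEstimator
{
    /** Densest-case depth, per the proposal above. */
    public static final int MAX_DEPTH = 20;

    /**
     * depth = MAX_DEPTH * (estimatedRows / maxHashEntries),
     * clamped to [1, MAX_DEPTH].
     *
     * @param estimatedRows  total estimated rows in the token range (n)
     * @param maxHashEntries maximum hash entries the partitioner allows
     *                       in that range (max)
     */
    public static int estimateDepth(long estimatedRows, long maxHashEntries)
    {
        if (estimatedRows <= 0 || maxHashEntries <= 0)
            return 1;
        // Integer arithmetic; multiply first to avoid truncating to zero.
        long depth = (MAX_DEPTH * estimatedRows) / maxHashEntries;
        return (int) Math.max(1, Math.min(MAX_DEPTH, depth));
    }
}
```

For example, a range holding half as many rows as it has possible hash entries would get a tree of depth 10.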

However, since different partitioners give different max numbers, is there 
anything we can assume to simplify this, such as assuming all partitioners 
have the same number of hash entries in a given token range?  

> Allow Merkle tree maximum depth to be configurable
> --------------------------------------------------
>
>                 Key: CASSANDRA-5263
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5263
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Config
>    Affects Versions: 1.1.9
>            Reporter: Ahmed Bashir
>            Assignee: Minh Do
>
> Currently, the maximum depth allowed for Merkle trees is hardcoded as 15.  
> This value should be configurable, just like phi_convict_threshold and other 
> properties.
> Given a cluster with nodes responsible for a large number of row keys, Merkle 
> tree comparisons can result in a large amount of unnecessary row keys being 
> streamed.
> Empirical testing indicates that reasonable changes to this depth (18, 20, 
> etc) don't affect the Merkle tree generation and differencing timings all 
> that much, and they can significantly reduce the amount of data being 
> streamed during repair. 



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)