[ 
https://issues.apache.org/jira/browse/CASSANDRA-4482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434150#comment-13434150
 ] 

Jonathan Ellis commented on CASSANDRA-4482:
-------------------------------------------

bq. The commutative properties of XOR make it possible to update the MT 
incrementally without having to read on write

Hang on, let's flesh this out.

I have an md5 hash (or part of one, see below) per row in a MerkleTree 
TreeRange.  I xor all these together to get my initial state, S.  To update row 
A to row A', I need to take S xor hash(A) xor hash(A').

So I still need to read-on-write to compute hash(A); I just don't have to 
rehash everything else in the same TreeRange.

(I can imagine breaking this down into xoring individual columns, which would 
mean we would only need to read modified columns and not the entire row, but 
the principle is the same.)
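
A minimal sketch of the update described above, in Python rather than Cassandra's own code. The function names (row_hash, initial_state, update_state) and the byte-string rows are hypothetical, chosen only for illustration. It shows that S xor hash(A) xor hash(A') equals the state rebuilt from scratch, and that the old row contents are still needed to compute hash(A).

```python
import hashlib


def row_hash(row_bytes: bytes) -> int:
    """md5 of a serialized row, as a 128-bit integer so it can be xored."""
    return int.from_bytes(hashlib.md5(row_bytes).digest(), "big")


def initial_state(rows) -> int:
    """Xor together the hashes of every row in one TreeRange."""
    state = 0
    for row in rows:
        state ^= row_hash(row)
    return state


def update_state(state: int, old_row: bytes, new_row: bytes) -> int:
    """S xor hash(A) xor hash(A'): the old row must be read to hash it."""
    return state ^ row_hash(old_row) ^ row_hash(new_row)


# Hypothetical rows in a single TreeRange.
rows = [b"row-A", b"row-B", b"row-C"]
s = initial_state(rows)

# Incremental update of row A to A' without rehashing B or C.
s_incremental = update_state(s, b"row-A", b"row-A-prime")

# Full rebuild for comparison.
s_rebuilt = initial_state([b"row-A-prime", b"row-B", b"row-C"])
assert s_incremental == s_rebuilt
```

The same shape would apply per column instead of per row; only the unit being hashed changes.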

bq. For num_tokens=256, that's 1 KB per range on average

I see, you mean vnode ranges.  What I meant was MT TreeRanges...  an MT can have 
64k TRs.  Ideally you will have 16 bytes (md5 size) per TR.  You can throw away 
some bytes at the cost of false negatives, i.e., with a single byte per TR, two 
replicas whose data differs will still think they have the same data 1/256 of 
the time.

But if you have 64k 1-byte TreeRanges, how do you fit that into 1KB?  Do you 
reduce the TR granularity further?  64k already feels too low...  although this 
is mitigated somewhat by vnodes.
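
The arithmetic behind this question can be sketched as follows. This assumes "64k" means 65,536 TreeRanges and "1 KB" means 1,024 bytes, and that truncated md5 bytes are uniformly distributed; the helper names are hypothetical.

```python
MD5_BYTES = 16
TREE_RANGES = 64 * 1024   # assumption: "64k" = 65,536
BUDGET_BYTES = 1024       # assumption: "1 KB" = 1,024 bytes


def tree_bytes(num_ranges: int, bytes_per_range: int) -> int:
    """Memory needed to keep one hash per TreeRange."""
    return num_ranges * bytes_per_range


def ranges_that_fit(budget_bytes: int, bytes_per_range: int) -> int:
    """How many TreeRanges a fixed byte budget can hold."""
    return budget_bytes // bytes_per_range


def undetected_mismatch_probability(bytes_per_range: int) -> float:
    """Chance that two differing ranges share the same truncated hash,
    assuming the kept bytes are uniformly distributed."""
    return 1 / 2 ** (8 * bytes_per_range)


print(tree_bytes(TREE_RANGES, MD5_BYTES))    # 1048576 bytes for full md5
print(tree_bytes(TREE_RANGES, 1))            # 65536 bytes at 1 byte per TR
print(ranges_that_fit(BUDGET_BYTES, 1))      # 1024 TRs fit in the budget
print(undetected_mismatch_probability(1))    # 0.00390625, i.e. 1/256
```

Under these assumptions, even at one byte per TR a 1 KB budget holds 1,024 TreeRanges, not 64k, so the granularity would have to drop by a factor of 64.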

bq. do have to reload $num_tokens ByteBuffers when creating the 
ColumnFamilyStore

And sync the BB saving with CF flushes so CL replay matches up, I imagine.
                
> In-memory merkle trees for repair
> ---------------------------------
>
>                 Key: CASSANDRA-4482
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4482
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Marcus Eriksson
>
> this sounds cool, we should reimplement it in the open source cassandra;
> http://www.acunu.com/2/post/2012/07/incremental-repair.html


        
