[ 
https://issues.apache.org/jira/browse/CASSANDRA-13052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15761136#comment-15761136
 ] 

Cristian P commented on CASSANDRA-13052:
----------------------------------------

Stefan, the code above is meant to highlight the root cause of the problem, 
not the actual range. If you provide a small range, the method shown in the 
description above will recursively divide the range until the left and right 
tokens have the same value. The next recursive iteration will then generate a 
midpoint token far outside the requested repair range.
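
The collapse can be sketched without any Cassandra classes. The midpoint 
helper below is a hypothetical stand-in, not the partitioner's code: it assumes 
a ring of signed 64-bit tokens on which left >= right is treated as a 
wrap-around range, an assumption chosen because it reproduces the midpoint 
printed in the issue description. The loop follows only the left half at each 
step, which simplifies the real tree walk.

```java
import java.math.BigInteger;

public class MidpointSketch {
    // Assumed ring bounds: the full signed 64-bit token space.
    static final BigInteger MIN = BigInteger.valueOf(Long.MIN_VALUE);
    static final BigInteger MAX = BigInteger.valueOf(Long.MAX_VALUE);

    // Hypothetical stand-in for a partitioner midpoint on a long token ring.
    static long midpoint(long left, long right) {
        BigInteger l = BigInteger.valueOf(left);
        BigInteger r = BigInteger.valueOf(right);
        if (l.compareTo(r) < 0) {
            // Ordinary case: halfway between left and right.
            return l.add(r).shiftRight(1).longValue();
        }
        // left >= right is read as a range that wraps around the ring,
        // so left == right is bisected as if it were the whole ring.
        BigInteger mid = MAX.subtract(MIN).add(l).add(r).shiftRight(1);
        if (mid.compareTo(MAX) > 0) {
            mid = MIN.add(mid.subtract(MAX));
        }
        return mid.longValue();
    }

    public static void main(String[] args) {
        long left = 7792951013348769424L;
        long right = 7792951013348769525L;
        // Keep bisecting the left half until it collapses to left == right.
        while (left != right) {
            long mid = midpoint(left, right);
            System.out.println("(" + left + "," + right + "] -> midpoint " + mid);
            right = mid;
        }
        // One more split of the collapsed range lands outside the original range.
        long stray = midpoint(left, right);
        System.out.println("(" + left + "," + right + "] -> midpoint " + stray);
    }
}
```

With these inputs the left half collapses after a handful of splits, and the 
final midpoint is a negative token, well outside the requested range.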

Here is an example (C* 2.0.14):

Note the token range provided for repair: 
(7792951013348769424,7792951013348769525]. That's 101 tokens.
But below you can see Differencer.java report "have 119 range(s) out of sync 
for testCF". I would say you cannot split 101 tokens into 119 ranges.
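
The arithmetic behind that comparison, as a small sketch. The class and method 
names are hypothetical, and it assumes a non-wrapping range (left < right) that 
excludes the left token and includes the right one, as the (left,right] 
notation suggests.

```java
public class RangeSize {
    // Number of tokens in the half-open range (left, right], assuming left < right.
    static long tokensIn(long left, long right) {
        return Math.subtractExact(right, left);
    }

    public static void main(String[] args) {
        long tokens = tokensIn(7792951013348769424L, 7792951013348769525L);
        long reportedRanges = 119; // figure taken from the Differencer.java log line
        System.out.println(tokens + " tokens, " + reportedRanges + " ranges reported");
        System.out.println("more ranges than tokens: " + (reportedRanges > tokens));
    }
}
```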

INFO [AntiEntropySessions:1] 2016-12-15 14:52:51,951 RepairSession.java (line 
246) [repair #27437120-c2d6-11e6-b49f-8b496c707234] new session: will sync 
/127.0.0.1, /127.0.0.2, /127.0.0.3 on range 
(7792951013348769424,7792951013348769525] for testKS.[testCF]
INFO [AntiEntropySessions:1] 2016-12-15 14:52:51,960 RepairJob.java (line 161) 
[repair #27437120-c2d6-11e6-b49f-8b496c707234] requesting merkle trees for 
testCF (to [/127.0.0.2, /127.0.0.3, /127.0.0.1])
INFO [AntiEntropyStage:1] 2016-12-15 14:52:52,054 RepairSession.java (line 166) 
[repair #27437120-c2d6-11e6-b49f-8b496c707234] Received merkle tree for testCF 
from /127.0.0.2
INFO [AntiEntropyStage:1] 2016-12-15 14:52:52,064 RepairSession.java (line 166) 
[repair #27437120-c2d6-11e6-b49f-8b496c707234] Received merkle tree for testCF 
from /127.0.0.1
INFO [AntiEntropyStage:1] 2016-12-15 14:52:52,065 RepairSession.java (line 166) 
[repair #27437120-c2d6-11e6-b49f-8b496c707234] Received merkle tree for testCF 
from /127.0.0.3
INFO [RepairJobTask:1] 2016-12-15 14:52:52,071 Differencer.java (line 67) 
[repair #27437120-c2d6-11e6-b49f-8b496c707234] Endpoints /127.0.0.2 and 
/127.0.0.1 are consistent for testCF
INFO [RepairJobTask:3] 2016-12-15 14:52:52,105 Differencer.java (line 74) 
[repair #27437120-c2d6-11e6-b49f-8b496c707234] Endpoints /127.0.0.1 and 
/127.0.0.3 have 119 range(s) out of sync for testCF
INFO [RepairJobTask:2] 2016-12-15 14:52:52,108 Differencer.java (line 74) 
[repair #27437120-c2d6-11e6-b49f-8b496c707234] Endpoints /127.0.0.2 and 
/127.0.0.3 have 119 range(s) out of sync for testCF
INFO [RepairJobTask:2] 2016-12-15 14:52:52,110 StreamingRepairTask.java (line 
77) [repair #27437120-c2d6-11e6-b49f-8b496c707234] Forwarding streaming repair 
of 119 ranges to /127.0.0.2 (to be streamed with /127.0.0.3)
INFO [RepairJobTask:3] 2016-12-15 14:52:52,118 StreamingRepairTask.java (line 
64) [streaming task #27437120-c2d6-11e6-b49f-8b496c707234] Performing streaming 
repair of 119 ranges with /127.0.0.3
INFO [STREAM-IN-/127.0.0.3] 2016-12-15 14:52:53,363 StreamingRepairTask.java 
(line 92) [repair #27437120-c2d6-11e6-b49f-8b496c707234] streaming task 
succeed, returning response to /127.0.0.1
INFO [AntiEntropyStage:1] 2016-12-15 14:52:53,372 RepairSession.java (line 223) 
[repair #27437120-c2d6-11e6-b49f-8b496c707234] testCF is fully synced
INFO [AntiEntropySessions:1] 2016-12-15 14:52:53,373 RepairSession.java (line 
284) [repair #27437120-c2d6-11e6-b49f-8b496c707234] session completed 
successfully
INFO [Thread-13] 2016-12-15 14:52:53,378 StorageService.java (line 2644) Repair 
session 27437120-c2d6-11e6-b49f-8b496c707234 for range 
(7792951013348769424,7792951013348769525] finished

> Repair process is violating the start/end token limits for small ranges
> -----------------------------------------------------------------------
>
>                 Key: CASSANDRA-13052
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13052
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Streaming and Messaging
>         Environment: We tried this in 2.0.14 and 3.9, same bug.
>            Reporter: Cristian P
>            Priority: Minor
>
> We tried to do a single-token repair by providing 2 consecutive token values 
> for a large column family. We soon noticed heavy streaming, and according to 
> the logs the number of ranges streamed was in the thousands.
> After investigation we found a bug in the two partitioner classes we use 
> (RandomPartitioner and Murmur3Partitioner).
> The midpoint method, which MerkleTree.differenceHelper uses to find ranges 
> with differences for streaming, returns abnormal values (far outside the 
> range initially requested for repair) if the requested repair range is small 
> (I expect smaller than 2^15).
> Here is the simple code to reproduce the bug for Murmur3Partitioner:
> Token left = new Murmur3Partitioner.LongToken(123456789L);
> Token right = new Murmur3Partitioner.LongToken(123456789L);
> IPartitioner partitioner = new Murmur3Partitioner();
> Token midpoint = partitioner.midpoint(left, right);
> System.out.println("Murmur3: [ " + left.getToken() + " : " + 
> midpoint.getToken() + " : " + right.getToken() + " ]");
> The output is:
> Murmur3: [ 123456789 : -9223372036731319019 : 123456789 ]
> Note that the midpoint token is nowhere near the suggested repair range. This 
> happens when, while traversing the tree in search of differences (in 
> MerkleTree.differenceHelper), there aren't enough tokens left to split and 
> the subrange becomes empty (left.token = right.token), as in the test above.



