[ https://issues.apache.org/jira/browse/CASSANDRA-13052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15761136#comment-15761136 ]
Cristian P commented on CASSANDRA-13052:
----------------------------------------

Stefan, the above code is meant to highlight the root cause of the problem, not the actual range. If you provide a small range, the method from the description above will recursively divide the ranges until the left and right tokens have the same value. The next recursive iteration will then generate a midpoint token far outside the requested repair range.

Here is an example (C* 2.0.14). Note the token range provided for repair: (7792951013348769424,7792951013348769525]. That's 101 tokens. Yet further down, Differencer.java reports "have 119 range(s) out of sync for testCF". I would say you cannot split 101 tokens into 119 ranges.

INFO [AntiEntropySessions:1] 2016-12-15 14:52:51,951 RepairSession.java (line 246) [repair #27437120-c2d6-11e6-b49f-8b496c707234] new session: will sync /127.0.0.1, /127.0.0.2, /127.0.0.3 on range (7792951013348769424,7792951013348769525] for testKS.[testCF]
INFO [AntiEntropySessions:1] 2016-12-15 14:52:51,960 RepairJob.java (line 161) [repair #27437120-c2d6-11e6-b49f-8b496c707234] requesting merkle trees for testCF (to [/127.0.0.2, /127.0.0.3, /127.0.0.1])
INFO [AntiEntropyStage:1] 2016-12-15 14:52:52,054 RepairSession.java (line 166) [repair #27437120-c2d6-11e6-b49f-8b496c707234] Received merkle tree for testCF from /127.0.0.2
INFO [AntiEntropyStage:1] 2016-12-15 14:52:52,064 RepairSession.java (line 166) [repair #27437120-c2d6-11e6-b49f-8b496c707234] Received merkle tree for testCF from /127.0.0.1
INFO [AntiEntropyStage:1] 2016-12-15 14:52:52,065 RepairSession.java (line 166) [repair #27437120-c2d6-11e6-b49f-8b496c707234] Received merkle tree for testCF from /127.0.0.3
INFO [RepairJobTask:1] 2016-12-15 14:52:52,071 Differencer.java (line 67) [repair #27437120-c2d6-11e6-b49f-8b496c707234] Endpoints /127.0.0.2 and /127.0.0.1 are consistent for testCF
INFO [RepairJobTask:3] 2016-12-15 14:52:52,105 Differencer.java (line 74) [repair #27437120-c2d6-11e6-b49f-8b496c707234] Endpoints /127.0.0.1 and /127.0.0.3 have 119 range(s) out of sync for testCF
INFO [RepairJobTask:2] 2016-12-15 14:52:52,108 Differencer.java (line 74) [repair #27437120-c2d6-11e6-b49f-8b496c707234] Endpoints /127.0.0.2 and /127.0.0.3 have 119 range(s) out of sync for testCF
INFO [RepairJobTask:2] 2016-12-15 14:52:52,110 StreamingRepairTask.java (line 77) [repair #27437120-c2d6-11e6-b49f-8b496c707234] Forwarding streaming repair of 119 ranges to /127.0.0.2 (to be streamed with /127.0.0.3)
INFO [RepairJobTask:3] 2016-12-15 14:52:52,118 StreamingRepairTask.java (line 64) [streaming task #27437120-c2d6-11e6-b49f-8b496c707234] Performing streaming repair of 119 ranges with /127.0.0.3
INFO [STREAM-IN-/127.0.0.3] 2016-12-15 14:52:53,363 StreamingRepairTask.java (line 92) [repair #27437120-c2d6-11e6-b49f-8b496c707234] streaming task succeed, returning response to /127.0.0.1
INFO [AntiEntropyStage:1] 2016-12-15 14:52:53,372 RepairSession.java (line 223) [repair #27437120-c2d6-11e6-b49f-8b496c707234] testCF is fully synced
INFO [AntiEntropySessions:1] 2016-12-15 14:52:53,373 RepairSession.java (line 284) [repair #27437120-c2d6-11e6-b49f-8b496c707234] session completed successfully
INFO [Thread-13] 2016-12-15 14:52:53,378 StorageService.java (line 2644) Repair session 27437120-c2d6-11e6-b49f-8b496c707234 for range (7792951013348769424,7792951013348769525] finished

> Repair process is violating the start/end token limits for small ranges
> -----------------------------------------------------------------------
>
>                 Key: CASSANDRA-13052
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13052
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Streaming and Messaging
>         Environment: We tried this in 2.0.14 and 3.9, same bug.
>            Reporter: Cristian P
>            Priority: Minor
>
> We tried to do a single token repair by providing 2 consecutive token values
> for a large column family. We soon noticed heavy streaming, and according to
> the logs the number of ranges streamed was in the thousands.
> After investigation we found a bug in the two partitioner classes we use
> (RandomPartitioner and Murmur3Partitioner).
> The midpoint method used by the MerkleTree.differenceHelper method to find
> ranges with differences for streaming returns abnormal values (way out of the
> initial range requested for repair) if the requested repair range is small (I
> expect smaller than 2^15).
> Here is the simple code to reproduce the bug for Murmur3Partitioner:
> Token left = new Murmur3Partitioner.LongToken(123456789L);
> Token right = new Murmur3Partitioner.LongToken(123456789L);
> IPartitioner partitioner = new Murmur3Partitioner();
> Token midpoint = partitioner.midpoint(left, right);
> System.out.println("Murmur3: [ " + left.getToken() + " : " + midpoint.getToken() + " : " + right.getToken() + " ]");
> The output is:
> Murmur3: [ 123456789 : -9223372036731319019 : 123456789 ]
> Note that the midpoint token is nowhere near the suggested repair range. This
> will happen if, while walking the tree (in MerkleTree.differenceHelper) in
> search of differences, there aren't enough tokens for the split and the
> subrange becomes 0 (left.token=right.token), as in the above test.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
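The midpoint value quoted in the issue description can be reproduced without a Cassandra checkout. The following is a minimal standalone sketch, not Cassandra code: the class RingMidpointSketch and its midpoint method are hypothetical names, and the wrap-around bisection they implement is an assumption about how a midpoint over the signed 64-bit token ring might behave. Under that assumption, a call with left == right is read as a range covering the whole ring, so the "midpoint" lands half a ring away from both bounds.

```java
import java.math.BigInteger;

// Standalone sketch, NOT Cassandra source. It assumes the midpoint is computed
// by bisecting a range on the signed 64-bit ring, treating left >= right as a
// range that wraps around the ring.
class RingMidpointSketch {
    static final BigInteger MIN = BigInteger.valueOf(Long.MIN_VALUE);
    static final BigInteger MAX = BigInteger.valueOf(Long.MAX_VALUE);

    static long midpoint(long left, long right) {
        BigInteger l = BigInteger.valueOf(left);
        BigInteger r = BigInteger.valueOf(right);
        if (l.compareTo(r) < 0) {
            // Ordinary case: average of the two bounds (BigInteger avoids overflow).
            return l.add(r).shiftRight(1).longValue();
        }
        // Wrapping case. With left == right the range is the entire ring,
        // so the result is roughly half a ring away from the input token.
        BigInteger mid = MAX.subtract(MIN).add(l).add(r).shiftRight(1);
        if (mid.compareTo(MAX) > 0) {
            mid = MIN.add(mid.subtract(MAX));
        }
        return mid.longValue();
    }

    public static void main(String[] args) {
        // Bounds of the repair range from the log: the midpoint stays inside.
        System.out.println(midpoint(7792951013348769424L, 7792951013348769525L));
        // prints 7792951013348769474

        // Degenerate range from the reproduction: the midpoint leaves the range.
        System.out.println(midpoint(123456789L, 123456789L));
        // prints -9223372036731319019
    }
}
```

The second call returns the same value as the Murmur3 output quoted in the description, which is consistent with the degenerate left == right case being handled as a full-ring wrap rather than as an empty range.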