[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681352#comment-14681352 ]

Marcus Olsson commented on CASSANDRA-5220:
------------------------------------------

LGTM!

> Repair improvements when using vnodes
> -------------------------------------
>
>                 Key: CASSANDRA-5220
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5220
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 1.2.0 beta 1
>            Reporter: Brandon Williams
>            Assignee: Marcus Olsson
>              Labels: performance, repair
>         Attachments: 5220-yourkit.png, 5220-yourkit.tar.bz2, cassandra-3.0-5220-1.patch, cassandra-3.0-5220-2.patch, cassandra-3.0-5220.patch
>
>
> Currently when using vnodes, repair takes much longer to complete than without them. This appears at least in part because it's using a session per range and processing them sequentially. This generates a lot of log spam with vnodes, and while being gentler and lighter on hard disk deployments, ssd-based deployments would often prefer that repair be as fast as possible.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14680959#comment-14680959 ]

Stefania commented on CASSANDRA-5220:
-------------------------------------

[~molsson] could you quickly review the coverity patch I linked in my comment above? Then, if all good, [~jbellis] could you commit it and resolve this ticket?
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14661132#comment-14661132 ]

Stefania commented on CASSANDRA-5220:
-------------------------------------

Attaching information on the Coverity defects reported against this patch. I propose to handle as follows:

* CID 1315416 - TokenRangeComparator should be serializable - ignore since we have several other comparators that are not serializable
* CID 1315412: RESOURCE_LEAK in CompactionStrategyManager line 368 - false positive, even though the lists aren't closed, their iterators are closed or returned for future use, same problem was already present before
* CID 1315410: Possible NPE in MerkleTrees line 172 - this code is for testing only so I propose to convert possible NPE to AssertionError and add @VisibleForTesting
* CID 1315409: Possible NPE in MerkleTrees line 130 - this code is for testing only so I propose to convert possible NPE to AssertionError and add @VisibleForTesting
* CID 1315407: Possible NPE in MerkleTrees line 162 - I verified it cannot happen so I propose to convert possible NPE to AssertionError

Proposed patch on the [3.0 branch|https://github.com/stef1927/cassandra/tree/5220-3.0] : [here|https://github.com/stef1927/cassandra/commit/aa419e331783e78c9aafe79eaeb0362e2338a6b6].

Here are the defects details:

{code}
** CID 1315416: FindBugs: Bad practice (FB.SE_COMPARATOR_SHOULD_BE_SERIALIZABLE)
/src/java/org/apache/cassandra/utils/MerkleTrees.java: 423 in ()

*** CID 1315416: FindBugs: Bad practice (FB.SE_COMPARATOR_SHOULD_BE_SERIALIZABLE)
/src/java/org/apache/cassandra/utils/MerkleTrees.java: 423 in ()
417             }
418             return size;
419         }
420
421     }
422
>>> CID 1315416: FindBugs: Bad practice (FB.SE_COMPARATOR_SHOULD_BE_SERIALIZABLE)
>>> org.apache.cassandra.utils.MerkleTrees$TokenRangeComparator implements Comparator but not Serializable.
423     private static class TokenRangeComparator implements Comparator<Range<Token>>
424     {
425         @Override
426         public int compare(Range<Token> rt1, Range<Token> rt2)
427         {
428             if (rt1.left.compareTo(rt2.left) == 0)

** CID 1315412: (RESOURCE_LEAK)
/src/java/org/apache/cassandra/db/compaction/CompactionStrategyManager.java: 368 in org.apache.cassandra.db.compaction.CompactionStrategyManager.getScanners(java.util.Collection, java.util.Collection)()
/src/java/org/apache/cassandra/db/compaction/CompactionStrategyManager.java: 368 in org.apache.cassandra.db.compaction.CompactionStrategyManager.getScanners(java.util.Collection, java.util.Collection)()

*** CID 1315412: (RESOURCE_LEAK)
/src/java/org/apache/cassandra/db/compaction/CompactionStrategyManager.java: 368 in org.apache.cassandra.db.compaction.CompactionStrategyManager.getScanners(java.util.Collection, java.util.Collection)()
362
363             for (ISSTableScanner scanner : Iterables.concat(repairedScanners.scanners, unrepairedScanners.scanners))
364             {
365                 if (!scanners.add(scanner))
366                     scanner.close();
367             }
>>> CID 1315412: (RESOURCE_LEAK)
>>> Variable "repairedScanners" going out of scope leaks the resource it refers to.
368         }
369
370         return new AbstractCompactionStrategy.ScannerList(new ArrayList<>(scanners));
371     }
372
373     public synchronized AbstractCompactionStrategy.ScannerList getScanners(Collection sstables)
/src/java/org/apache/cassandra/db/compaction/CompactionStrategyManager.java: 368 in org.apache.cassandra.db.compaction.CompactionStrategyManager.getScanners(java.util.Collection, java.util.Collection)()
362
363             for (ISSTableScanner scanner : Iterables.concat(repairedScanners.scanners, unrepairedScanners.scanners))
364             {
365                 if (!scanners.add(scanner))
366                     scanner.close();
367             }
>>> CID 1315412: (RESOURCE_LEAK)
>>> Variable "unrepairedScanners" going out of scope leaks the resource it refers to.
368         }
369
370         return new AbstractCompactionStrategy.ScannerList(new ArrayList<>(scanners));
371     }
372
373     public synchronized AbstractCompactionStrategy.ScannerList getScanners(Collection sstables)

** CID 1315410: Null pointer dereferences (NULL_RETURNS)
/src/java/org/apache/cassandra/utils/MerkleTrees.java: 172 in org.apache.cassandra.utils.MerkleTrees.invalidate(org.apache.cassandra.dht.Token)()

*** C
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660750#comment-14660750 ]

Yuki Morishita commented on CASSANDRA-5220:
-------------------------------------------

Unfortunately no. This fix involves a message format change, so backporting it would break compatibility within a major version.
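To illustrate why a message format change cannot simply be backported within a major version, here is a minimal Java sketch of version-gated serialization. The class name, the version constants, and the wire layout (a count followed by pairs of longs) are assumptions made for this illustration; they are not Cassandra's actual serializers or messaging versions.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: names, versions and layout are not Cassandra's.
final class VersionedRangesSketch
{
    static final int OLD_VERSION = 1; // assumed: exactly one (left, right) pair per message
    static final int NEW_VERSION = 2; // assumed: a count, then that many (left, right) pairs

    static ByteBuffer serialize(List<long[]> ranges, int version)
    {
        boolean multi = version >= NEW_VERSION;
        // The old layout has no room for a count, so it cannot carry several ranges.
        if (!multi && ranges.size() != 1)
            throw new IllegalArgumentException("old format carries exactly one range");

        int size = (multi ? Integer.BYTES : 0) + ranges.size() * 2 * Long.BYTES;
        ByteBuffer out = ByteBuffer.allocate(size);
        if (multi)
            out.putInt(ranges.size());
        for (long[] range : ranges)
            out.putLong(range[0]).putLong(range[1]);
        out.flip();
        return out;
    }

    static List<long[]> deserialize(ByteBuffer in, int version)
    {
        // The reader must agree with the writer on the version to find the count.
        int count = version >= NEW_VERSION ? in.getInt() : 1;
        List<long[]> ranges = new ArrayList<>(count);
        for (int i = 0; i < count; i++)
            ranges.add(new long[]{ in.getLong(), in.getLong() });
        return ranges;
    }
}
```

In this sketch, a peer that only knows the single-range layout would read the leading count as part of a token, which is the kind of incompatibility that nodes within one major version are not expected to handle.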
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660740#comment-14660740 ]

Kenneth Failbus commented on CASSANDRA-5220:
--------------------------------------------

Will this fix be back-ported to the 2.0.x or 2.1.x releases? It would be a big help, since it would make the product stable on those releases. Vnodes are very good functionality for scaling, but only if repairs keep up.
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660003#comment-14660003 ]

Jonathan Ellis commented on CASSANDRA-5220:
-------------------------------------------

Committed. Thanks, Marcus and Stefania!
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14659640#comment-14659640 ]

Stefania commented on CASSANDRA-5220:
-------------------------------------

Continuous integration results are comparable to the unpatched cassandra-3.0 results; the [3.0 patch|https://github.com/stef1927/cassandra/commits/5220-3.0] can be committed.

[~jbellis] can you take care of this?
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14659593#comment-14659593 ]

Marcus Olsson commented on CASSANDRA-5220:
------------------------------------------

LGTM! :)
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14659424#comment-14659424 ]

Jonathan Ellis commented on CASSANDRA-5220:
-------------------------------------------

I'm okay with adding this to 3.0, since otherwise we'll need to wait for either 8110 or 4.0, and I don't think that's fair to Marcus since he had the first version written months ago.
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14659416#comment-14659416 ]

Stefania commented on CASSANDRA-5220:
-------------------------------------

Thanks, I made a couple more really tiny changes [here|https://github.com/stef1927/cassandra/commit/dbd5c88c6f89ff303f4fece9bb8c5ffa6c3825a1].

The TODO comment above was misplaced, sorry; I meant it for {{MerkleTrees}}. You're quite right that we don't need to change the existing trunk behavior.

About _repair_history_, I verified it would result in an exception when upgrading from 2.2 with some sstables already on disk. Although I believe we could ask people to wipe this data on a major upgrade, I don't see why we should inconvenience people, so I went ahead and reverted to the old format and inserted one row per range; see the commit [here|https://github.com/stef1927/cassandra/commit/92bd923a8b2d9976dc711f1b7007d25db30d06f9]. Thanks for spotting this.

If you confirm these final changes are OK, then I am +1 to commit once CI completes.

[~jbellis] I assume we want this on 3.0? If so, I ported the patch to _cassandra-3.0_ [here|https://github.com/stef1927/cassandra/commits/5220-3.0]. It is identical to the [trunk patch|https://github.com/stef1927/cassandra/commits/5220] as it applied with no conflicts. You can pick whichever you need depending on where you want to commit to and discard the other one.

CI results for trunk will appear here:
http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-5220-testall/
http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-5220-dtest/

CI results for 3.0 are instead here:
http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-5220-3.0-testall/
http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-5220-3.0-dtest/
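The "one row per range" approach for _repair_history_ mentioned in this comment can be sketched roughly as below. This is an illustration only: the {{Row}} type and its field names are hypothetical, tokens are shown as plain strings, and the real table may well have more columns than the three used here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration: not the actual Cassandra schema or write path.
final class RepairHistoryRowsSketch
{
    // One history row, keeping the old "start and end range" shape.
    static final class Row
    {
        final String sessionId;
        final String rangeBegin;
        final String rangeEnd;

        Row(String sessionId, String rangeBegin, String rangeEnd)
        {
            this.sessionId = sessionId;
            this.rangeBegin = rangeBegin;
            this.rangeEnd = rangeEnd;
        }
    }

    // A session that covers several ranges produces one row per range,
    // instead of a single row holding a set of ranges.
    static List<Row> rowsFor(String sessionId, List<String[]> ranges)
    {
        List<Row> rows = new ArrayList<>(ranges.size());
        for (String[] range : ranges)
            rows.add(new Row(sessionId, range[0], range[1]));
        return rows;
    }
}
```

Because every row keeps the begin/end shape, data written before the upgrade and data written after it can be read the same way, which is the motivation given above for keeping the old format.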
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658349#comment-14658349 ]

Marcus Olsson commented on CASSANDRA-5220:
------------------------------------------

Created a pull request [here|https://github.com/stef1927/cassandra/pull/2] to your branch.

Most comments should've been fixed but there was one in particular I wasn't 100% sure about. In _RepairJobDesc.java_ in the _deserialize()_ method:
{quote}
// CR-TODO is it safe to use the MS.globalPartitioner() here?
range = (Range) AbstractBounds.tokenSerializer.deserialize(in, MessagingService.globalPartitioner(), version);
{quote}
Not sure what to use instead, but I guess it should be safe since the trunk version uses it.
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658262#comment-14658262 ]

Marcus Olsson commented on CASSANDRA-5220:
------------------------------------------

While looking at CASSANDRA-5839 I realized that this might break something during upgrade from 2.2->3.0, with this patch the table _repair_history_ changes to have a set of ranges instead of a start and end range. (This patch was first done when

Should I change the table back and do one insert per range instead?
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658225#comment-14658225 ]

Stefania commented on CASSANDRA-5220:
-------------------------------------

Sounds great, thanks! :)
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14655293#comment-14655293 ]

Marcus Olsson commented on CASSANDRA-5220:
------------------------------------------

I'm happy to implement it!
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14655230#comment-14655230 ]

Stefania commented on CASSANDRA-5220:
-------------------------------------

As you prefer. Ideally you would implement them, but if you are busy I can implement them myself and you can review afterwards. Just let me know.
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14655181#comment-14655181 ]

Marcus Olsson commented on CASSANDRA-5220:
------------------------------------------

Nice, would you like me to take care of the main points/nits/comments as well or would you rather fix them yourself?

Regarding the main points:
#2 For MerkleTrees serialization I guess we could remove the range, serialize only the individual MerkleTree objects, and use each tree's fullRange.
#3 I guess I missed that option, it should probably be possible to use TreeMap instead.
#4 I don't think the token ranges should overlap, so a few assertions could be useful.
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654768#comment-14654768 ]

Stefania commented on CASSANDRA-5220:
-------------------------------------

Great to hear this, then we should be able to commit this soon, the remaining points won't take long at all.
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654755#comment-14654755 ]

Jonathan Ellis commented on CASSANDRA-5220:
-------------------------------------------

We can't support repair anyway with older-version nodes until we have CASSANDRA-8110, so don't worry about it here.
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654745#comment-14654745 ]

Stefania commented on CASSANDRA-5220:
-------------------------------------

Quite impressive gain indeed! Thanks for fixing those rebase errors too.

I've merged your branch into mine and rebased so that we can more easily compare the CI results. As you've noticed, some test failures are not related to this patch, so keeping it up-to-date with trunk makes it easier to compare the test results with trunk ([here|http://cassci.datastax.com/job/trunk_testall] and [here|http://cassci.datastax.com/job/trunk_dtest]).

I also pushed [another commit|https://github.com/stef1927/cassandra/commit/27615434aec0ce05c2bfa689020b0e00a6409590] with some very minor changes, mostly nits or comments. There are also a couple of trivial things to do marked as {{// CR-TODO}}. I prefer not to clutter the discussion with these trivial matters and to instead focus on the main points, but if upon checking the changes something concerns you then don't hesitate to raise it.

Here are the main points:

* Do we need to support repair with older replicas? Normally we do support older nodes in a cluster when changing message formats; that's why we have a version in the serializers. So unless repair is different, we need to make sure we still send the old message format to the old nodes, which I'm afraid could be a bit of a pain to implement. cc [~jbellis] to confirm.
* In {{MerkleTrees.deserialize()}}: is it safe to use {{MessagingService.globalPartitioner()}}? {{MerkleTree}} currently serializes the partitioner name, so I would have thought we need to do the same. In fact, why send the range on the wire at all; can we not just take it from the tree's {{fullRange}}?
* In {{MerkleTrees}}: why do we need a separate list of {{Range}}? Isn't a sorted map such as a tree map sufficient?
* The token ranges should not overlap from what I understand, so should we add a couple of assertions in {{MerkleTrees}} to make sure this is the case? (I'm not sure about this one.)
* By reading the code documentation of {{RepairSession}} I found an old ticket, CASSANDRA-2816. I believe this proposed implementation should be fine as we scan multiple ranges at the same time in the validation compaction, but I did not read the entire discussion on that ticket and so I thought I'd mention it here.
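The sorted-map and non-overlap points above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in {{MerkleTrees}}: tokens are plain longs, ranges are (left, right] with no wrap-around, the tree payload is just a String, and all class and method names are hypothetical.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration: a sorted map keyed by range, with overlap checks.
final class RangeTreeMapSketch
{
    // Simplified stand-in for a token range (left, right]; wrap-around is not modelled.
    static final class Range
    {
        final long left;
        final long right;

        Range(long left, long right)
        {
            this.left = left;
            this.right = right;
        }
    }

    // Order by left token, then by right token.
    static final Comparator<Range> BY_LEFT_THEN_RIGHT =
        Comparator.<Range>comparingLong(r -> r.left).thenComparingLong(r -> r.right);

    private final TreeMap<Range, String> trees = new TreeMap<>(BY_LEFT_THEN_RIGHT);

    void add(Range range, String tree)
    {
        if (range.left >= range.right)
            throw new IllegalArgumentException("empty or wrapping range not modelled");

        // With non-overlapping keys, only the two sorted neighbours can overlap a new range.
        Map.Entry<Range, String> before = trees.floorEntry(range);
        Map.Entry<Range, String> after = trees.ceilingEntry(range);
        // Thrown explicitly so the check also runs when JVM assertions are disabled.
        if (before != null && before.getKey().right > range.left)
            throw new AssertionError("range overlaps its predecessor");
        if (after != null && after.getKey().left < range.right)
            throw new AssertionError("range overlaps its successor");
        trees.put(range, tree);
    }

    // Returns the tree whose range contains the token, or null if none does.
    String treeFor(long token)
    {
        // The probe sorts before every real range starting at 'token', so lowerEntry
        // yields the range with the greatest left bound strictly below the token.
        Map.Entry<Range, String> candidate = trees.lowerEntry(new Range(token, Long.MIN_VALUE));
        if (candidate == null || token > candidate.getKey().right)
            return null;
        return candidate.getValue();
    }
}
```

With the map sorted this way, adjacent ranges such as (0, 10] and (10, 20] are accepted, while adding (5, 15] afterwards fails the overlap check, and a token lookup needs only one map probe rather than a scan over a separate list of ranges.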
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654147#comment-14654147 ] Jonathan Ellis commented on CASSANDRA-5220: --- Very substantial. Excited to get this in! > Repair improvements when using vnodes > - > > Key: CASSANDRA-5220 > URL: https://issues.apache.org/jira/browse/CASSANDRA-5220 > Project: Cassandra > Issue Type: Improvement > Components: Core >Affects Versions: 1.2.0 beta 1 >Reporter: Brandon Williams >Assignee: Marcus Olsson > Labels: performance, repair > Attachments: 5220-yourkit.png, 5220-yourkit.tar.bz2, > cassandra-3.0-5220-1.patch, cassandra-3.0-5220-2.patch, > cassandra-3.0-5220.patch > > > Currently when using vnodes, repair takes much longer to complete than > without them. This appears at least in part because it's using a session per > range and processing them sequentially. This generates a lot of log spam > with vnodes, and while being gentler and lighter on hard disk deployments, > ssd-based deployments would often prefer that repair be as fast as possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14653834#comment-14653834 ]

Marcus Olsson commented on CASSANDRA-5220:

Thanks, I've managed to get the dtests up and running now!

I had some problems running the dtests, but I think there might have been two small misses with the rebase. So the dtests are now working properly with [this|https://github.com/emolsson/cassandra/commits/5220] patch, and when running

{noformat}
PRINT_DEBUG=True nosetests -s -v repair_test.py:TestRepair.simple_parallel_repair_test
{noformat}

on both trunk and the patched trunk I see an improvement from ~14.2s to ~6.7s repair time, and without vnodes (but with the patch) it was ~2s.

> Repair improvements when using vnodes > - > > Key: CASSANDRA-5220 > URL: https://issues.apache.org/jira/browse/CASSANDRA-5220 > Project: Cassandra > Issue Type: Improvement > Components: Core >Affects Versions: 1.2.0 beta 1 >Reporter: Brandon Williams >Assignee: Marcus Olsson > Labels: performance, repair > Attachments: 5220-yourkit.png, 5220-yourkit.tar.bz2, > cassandra-3.0-5220-1.patch, cassandra-3.0-5220-2.patch, > cassandra-3.0-5220.patch > > > Currently when using vnodes, repair takes much longer to complete than > without them. This appears at least in part because it's using a session per > range and processing them sequentially. This generates a lot of log spam > with vnodes, and while being gentler and lighter on hard disk deployments, > ssd-based deployments would often prefer that repair be as fast as possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14653407#comment-14653407 ] Stefania commented on CASSANDRA-5220: - Thanks for resuming work at such short notice. About the dtests, it's a compatibility issue with the python driver. If you have a copy of the driver git repository then the "cassandra-test" branch should work. Otherwise unzip the driver zip file bundled with the cassandra source (lib/cassandra-driver-internal-only-2.6.0c2.zip) and either install this version (python setup.py install) or make sure the cassandra folder is reachable by the dtests, i.e. by putting it in the same directory as the dtests. If using a local folder you probably also have to uninstall the official driver, if installed at all. > Repair improvements when using vnodes > - > > Key: CASSANDRA-5220 > URL: https://issues.apache.org/jira/browse/CASSANDRA-5220 > Project: Cassandra > Issue Type: Improvement > Components: Core >Affects Versions: 1.2.0 beta 1 >Reporter: Brandon Williams >Assignee: Marcus Olsson > Labels: performance, repair > Attachments: 5220-yourkit.png, 5220-yourkit.tar.bz2, > cassandra-3.0-5220-1.patch, cassandra-3.0-5220-2.patch, > cassandra-3.0-5220.patch > > > Currently when using vnodes, repair takes much longer to complete than > without them. This appears at least in part because it's using a session per > range and processing them sequentially. This generates a lot of log spam > with vnodes, and while being gentler and lighter on hard disk deployments, > ssd-based deployments would often prefer that repair be as fast as possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14653376#comment-14653376 ]

Marcus Olsson commented on CASSANDRA-5220:

Sorry about the unit tests. I had some problems with ant and junit when I uploaded the patch, so I ran those tests through Eclipse, and it seems that I missed the -ea flag. Also, the tests in SerializationsTest seem to be broken (due to the changes for validations). The other two failing tests seem to be failing on trunk as well, so I'm assuming that it's not due to this patch. I'm working from your rebased branch 5220 and have fixed the unit tests [here|https://github.com/emolsson/cassandra/commits/5220].

Also, I seem to be having some problems running the dtests. Is there something special that needs to be done to run the dtests on trunk? I get the following error message:

{noformat}
NoHostAvailable: ('Unable to connect to any servers', {'127.0.0.1': InvalidRequest(u'code=2200 [Invalid query] message="unconfigured table schema_keyspaces"',)})
{noformat}

> Repair improvements when using vnodes > - > > Key: CASSANDRA-5220 > URL: https://issues.apache.org/jira/browse/CASSANDRA-5220 > Project: Cassandra > Issue Type: Improvement > Components: Core >Affects Versions: 1.2.0 beta 1 >Reporter: Brandon Williams >Assignee: Marcus Olsson > Labels: performance, repair > Attachments: 5220-yourkit.png, 5220-yourkit.tar.bz2, > cassandra-3.0-5220-1.patch, cassandra-3.0-5220-2.patch, > cassandra-3.0-5220.patch > > > Currently when using vnodes, repair takes much longer to complete than > without them. This appears at least in part because it's using a session per > range and processing them sequentially. This generates a lot of log spam > with vnodes, and while being gentler and lighter on hard disk deployments, > ssd-based deployments would often prefer that repair be as fast as possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14651640#comment-14651640 ]

Stefania commented on CASSANDRA-5220:

I've rebased _cassandra-3.0-5220-2.patch_ onto the latest trunk [here|https://github.com/stef1927/cassandra/commits/5220]. I haven't had time to look at the code in depth yet; I plan to do so in the next few days.

Meanwhile, these unit tests are failing:

* MerkleTreesTest.testHashRandom
* LeveledCompactionStrategyTest.testValidationMultipleSSTablePerLevel

Initially I thought it was because of the rebase, but when I applied the patch onto trunk as of April 2015 [here|https://github.com/stef1927/cassandra/commits/5220-old], with no conflicts, they were also failing. There may be more broken unit tests; I've only checked the ones that were modified by the patch.

Eventually the full CI will appear here:

http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-5220-testall/
http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-5220-dtest/

> Repair improvements when using vnodes > - > > Key: CASSANDRA-5220 > URL: https://issues.apache.org/jira/browse/CASSANDRA-5220 > Project: Cassandra > Issue Type: Improvement > Components: Core >Affects Versions: 1.2.0 beta 1 >Reporter: Brandon Williams >Assignee: Marcus Olsson > Labels: performance, repair > Attachments: 5220-yourkit.png, 5220-yourkit.tar.bz2, > cassandra-3.0-5220-1.patch, cassandra-3.0-5220-2.patch, > cassandra-3.0-5220.patch > > > Currently when using vnodes, repair takes much longer to complete than > without them. This appears at least in part because it's using a session per > range and processing them sequentially. This generates a lot of log spam > with vnodes, and while being gentler and lighter on hard disk deployments, > ssd-based deployments would often prefer that repair be as fast as possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14511202#comment-14511202 ]

Marcus Olsson commented on CASSANDRA-5220:

Yes, I ran the dtest, and I see these exceptions as well while running it. The tests I ran before were very basic: three nodes, using the stress tool with the cqlstress-example.yaml profile (changing the replication factor to two), run with n=100. Then I stopped a node, removed the inserted data and all commitlog entries, started it again and ran a full repair on that node using `repair -full -- stresscql`.

The main problem seems to be that it runs out of TreeRanges to iterate over while doing the validation compaction. I have probably made a faulty assumption somewhere, and the first thing that comes to mind is that the wrapping iterator sorts the ranges in a different order from the one in which the validation compaction reads them. Unfortunately I don't have time to debug this further until Monday.

> Repair improvements when using vnodes > - > > Key: CASSANDRA-5220 > URL: https://issues.apache.org/jira/browse/CASSANDRA-5220 > Project: Cassandra > Issue Type: Improvement > Components: Core >Affects Versions: 1.2.0 beta 1 >Reporter: Brandon Williams > Labels: performance, repair > Attachments: 5220-yourkit.png, 5220-yourkit.tar.bz2, > cassandra-3.0-5220-1.patch, cassandra-3.0-5220.patch > > > Currently when using vnodes, repair takes much longer to complete than > without them. This appears at least in part because it's using a session per > range and processing them sequentially. This generates a lot of log spam > with vnodes, and while being gentler and lighter on hard disk deployments, > ssd-based deployments would often prefer that repair be as fast as possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14510971#comment-14510971 ] Ryan McGuire commented on CASSANDRA-5220: - Thanks [~molsson], the patch applies correctly now :) > Repair improvements when using vnodes > - > > Key: CASSANDRA-5220 > URL: https://issues.apache.org/jira/browse/CASSANDRA-5220 > Project: Cassandra > Issue Type: Improvement > Components: Core >Affects Versions: 1.2.0 beta 1 >Reporter: Brandon Williams > Labels: performance, repair > Attachments: 5220-yourkit.png, 5220-yourkit.tar.bz2, > cassandra-3.0-5220-1.patch, cassandra-3.0-5220.patch > > > Currently when using vnodes, repair takes much longer to complete than > without them. This appears at least in part because it's using a session per > range and processing them sequentially. This generates a lot of log spam > with vnodes, and while being gentler and lighter on hard disk deployments, > ssd-based deployments would often prefer that repair be as fast as possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14510776#comment-14510776 ] Marcus Olsson commented on CASSANDRA-5220: -- Hi, yes I had some mix-up with my branches, so this wasn't the latest patch I'm afraid, will try to upload the new patch ASAP. > Repair improvements when using vnodes > - > > Key: CASSANDRA-5220 > URL: https://issues.apache.org/jira/browse/CASSANDRA-5220 > Project: Cassandra > Issue Type: Improvement > Components: Core >Affects Versions: 1.2.0 beta 1 >Reporter: Brandon Williams > Labels: performance, repair > Attachments: 5220-yourkit.png, 5220-yourkit.tar.bz2, > cassandra-3.0-5220.patch > > > Currently when using vnodes, repair takes much longer to complete than > without them. This appears at least in part because it's using a session per > range and processing them sequentially. This generates a lot of log spam > with vnodes, and while being gentler and lighter on hard disk deployments, > ssd-based deployments would often prefer that repair be as fast as possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14509113#comment-14509113 ] Ryan McGuire commented on CASSANDRA-5220: - Hi [~molsson], I'm happy to run some of my own testing on this, but I'm having trouble applying your patch. Can you rebase it or let me know what git SHA your patch applies to? > Repair improvements when using vnodes > - > > Key: CASSANDRA-5220 > URL: https://issues.apache.org/jira/browse/CASSANDRA-5220 > Project: Cassandra > Issue Type: Improvement > Components: Core >Affects Versions: 1.2.0 beta 1 >Reporter: Brandon Williams > Labels: performance, repair > Attachments: 5220-yourkit.png, 5220-yourkit.tar.bz2, > cassandra-3.0-5220.patch > > > Currently when using vnodes, repair takes much longer to complete than > without them. This appears at least in part because it's using a session per > range and processing them sequentially. This generates a lot of log spam > with vnodes, and while being gentler and lighter on hard disk deployments, > ssd-based deployments would often prefer that repair be as fast as possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14508997#comment-14508997 ] Marcus Olsson commented on CASSANDRA-5220: -- I've done some smaller tests to verify that it works, but I haven't had the chance to run performance testing on it. > Repair improvements when using vnodes > - > > Key: CASSANDRA-5220 > URL: https://issues.apache.org/jira/browse/CASSANDRA-5220 > Project: Cassandra > Issue Type: Improvement > Components: Core >Affects Versions: 1.2.0 beta 1 >Reporter: Brandon Williams > Labels: performance, repair > Attachments: 5220-yourkit.png, 5220-yourkit.tar.bz2, > cassandra-3.0-5220.patch > > > Currently when using vnodes, repair takes much longer to complete than > without them. This appears at least in part because it's using a session per > range and processing them sequentially. This generates a lot of log spam > with vnodes, and while being gentler and lighter on hard disk deployments, > ssd-based deployments would often prefer that repair be as fast as possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14508973#comment-14508973 ] Jeremiah Jordan commented on CASSANDRA-5220: Did you run any tests to see how this improved things? > Repair improvements when using vnodes > - > > Key: CASSANDRA-5220 > URL: https://issues.apache.org/jira/browse/CASSANDRA-5220 > Project: Cassandra > Issue Type: Improvement > Components: Core >Affects Versions: 1.2.0 beta 1 >Reporter: Brandon Williams > Labels: performance, repair > Attachments: 5220-yourkit.png, 5220-yourkit.tar.bz2, > cassandra-3.0-5220.patch > > > Currently when using vnodes, repair takes much longer to complete than > without them. This appears at least in part because it's using a session per > range and processing them sequentially. This generates a lot of log spam > with vnodes, and while being gentler and lighter on hard disk deployments, > ssd-based deployments would often prefer that repair be as fast as possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14508815#comment-14508815 ]

Marcus Olsson commented on CASSANDRA-5220:

I've done some work on this to make repair handle multiple ranges at the same time (attaching patch). Essentially, it finds the common ranges for a set of nodes and repairs them all at the same time. Assume we have three nodes A, B and C with RF=2, containing the ranges:

A -> 1, 2, 3, 4
B -> 3, 4, 5, 6
C -> 1, 2, 5, 6

Then if we issue a repair -pr on A, it would create two repair sessions:

(A, B) -> (3, 4)
(A, C) -> (1, 2)

instead of one for each range:

(A, B) -> 3
(A, B) -> 4
(A, C) -> 1
(A, C) -> 2

The change is mostly centered around the new utility class MerkleTrees, which is a wrapper for multiple MerkleTree instances and their associated ranges. This utility class replaces the occurrences of the MerkleTree class in the validator phase and the repair messages.

The changes are not backwards compatible, since the repair job sends multiple ranges and validation complete sends MerkleTrees instead of MerkleTree.

> Repair improvements when using vnodes > - > > Key: CASSANDRA-5220 > URL: https://issues.apache.org/jira/browse/CASSANDRA-5220 > Project: Cassandra > Issue Type: Improvement > Components: Core >Affects Versions: 1.2.0 beta 1 >Reporter: Brandon Williams >Assignee: Yuki Morishita > Labels: performance, repair > Attachments: 5220-yourkit.png, 5220-yourkit.tar.bz2, > cassandra-3.0-5220.patch > > > Currently when using vnodes, repair takes much longer to complete than > without them. This appears at least in part because it's using a session per > range and processing them sequentially. This generates a lot of log spam > with vnodes, and while being gentler and lighter on hard disk deployments, > ssd-based deployments would often prefer that repair be as fast as possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
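The grouping in the A/B/C example from the comment above can be sketched as follows. This is only an illustration in Python, not the patch's Java code: the function name `group_ranges_by_replicas` and the `ownership` mapping are hypothetical, and ranges are represented by the plain numbers used in the example.

```python
from collections import defaultdict


def group_ranges_by_replicas(ownership, ranges_to_repair):
    """Group ranges by the exact set of nodes that hold them.

    ownership: dict mapping node name -> set of range ids it replicates.
    Returns a dict mapping frozenset(nodes) -> list of range ids, i.e. one
    entry per repair session instead of one session per range.
    Hypothetical sketch; names are not taken from the patch.
    """
    sessions = defaultdict(list)
    for range_id in ranges_to_repair:
        replicas = frozenset(
            node for node, owned in ownership.items() if range_id in owned
        )
        sessions[replicas].append(range_id)
    return dict(sessions)


# The example from the comment: three nodes, RF=2.
ownership = {
    "A": {1, 2, 3, 4},
    "B": {3, 4, 5, 6},
    "C": {1, 2, 5, 6},
}
sessions = group_ranges_by_replicas(ownership, [1, 2, 3, 4])
for nodes, ranges in sessions.items():
    print(sorted(nodes), "->", ranges)
```

With the example data this yields two groups, one for (A, C) with ranges 1 and 2 and one for (A, B) with ranges 3 and 4, rather than four single-range sessions.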
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261376#comment-14261376 ] Jeremy Hanna commented on CASSANDRA-5220: - I think it's important to reiterate that the project devs recognize that these inefficiencies are impacting many users. However, lots of parallel work is getting done on repair. As Yuki pointed out, with incremental repair (CASSANDRA-5351) already in 2.1 and improving the concurrency of the repair process (CASSANDRA-6455) coming in 3.0, many of the problems seen in this ticket will be resolved. Until 2.1/3.0, sub-range repair (CASSANDRA-5280) is helpful to parallelize and repair more efficiently with virtual nodes. See http://www.datastax.com/dev/blog/advanced-repair-techniques for details about efficiency gains with sub-range repair. It's just more tedious to track. Saving repair data to a system table (CASSANDRA-5839) will help track that in Cassandra itself. > Repair improvements when using vnodes > - > > Key: CASSANDRA-5220 > URL: https://issues.apache.org/jira/browse/CASSANDRA-5220 > Project: Cassandra > Issue Type: Improvement > Components: Core >Affects Versions: 1.2.0 beta 1 >Reporter: Brandon Williams >Assignee: Yuki Morishita > Labels: performance, repair > Attachments: 5220-yourkit.png, 5220-yourkit.tar.bz2 > > > Currently when using vnodes, repair takes much longer to complete than > without them. This appears at least in part because it's using a session per > range and processing them sequentially. This generates a lot of log spam > with vnodes, and while being gentler and lighter on hard disk deployments, > ssd-based deployments would often prefer that repair be as fast as possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14143627#comment-14143627 ]

Yuki Morishita commented on CASSANDRA-5220:

I'm inclined to mark this 'later' in favor of incremental repair and internal refactoring such as CASSANDRA-6455. In particular, incremental repair should decrease the time needed for validating data, which is one of the major heavy-lifting processes of repair.

> Repair improvements when using vnodes > - > > Key: CASSANDRA-5220 > URL: https://issues.apache.org/jira/browse/CASSANDRA-5220 > Project: Cassandra > Issue Type: Improvement > Components: Core >Affects Versions: 1.2.0 beta 1 >Reporter: Brandon Williams >Assignee: Yuki Morishita > Labels: performance, repair > Fix For: 3.0 > > Attachments: 5220-yourkit.png, 5220-yourkit.tar.bz2 > > > Currently when using vnodes, repair takes much longer to complete than > without them. This appears at least in part because it's using a session per > range and processing them sequentially. This generates a lot of log spam > with vnodes, and while being gentler and lighter on hard disk deployments, > ssd-based deployments would often prefer that repair be as fast as possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14110288#comment-14110288 ] Jonathan Ellis commented on CASSANDRA-5220: --- bq. I just talked to some people who were seeing an 8 node (256 vnodes each) repair with about 1GB/node take two days. I'm still not sure we have a good handle on this. Is this reproducible? I'm not convinced "spending more time in messaging" is an adequate explanation. > Repair improvements when using vnodes > - > > Key: CASSANDRA-5220 > URL: https://issues.apache.org/jira/browse/CASSANDRA-5220 > Project: Cassandra > Issue Type: Improvement > Components: Core >Affects Versions: 1.2.0 beta 1 >Reporter: Brandon Williams >Assignee: Yuki Morishita > Labels: performance, repair > Fix For: 3.0 > > Attachments: 5220-yourkit.png, 5220-yourkit.tar.bz2 > > > Currently when using vnodes, repair takes much longer to complete than > without them. This appears at least in part because it's using a session per > range and processing them sequentially. This generates a lot of log spam > with vnodes, and while being gentler and lighter on hard disk deployments, > ssd-based deployments would often prefer that repair be as fast as possible. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019031#comment-14019031 ] Jeffery Schnick commented on CASSANDRA-5220: [~cscetbon] Thank you for the info. > Repair improvements when using vnodes > - > > Key: CASSANDRA-5220 > URL: https://issues.apache.org/jira/browse/CASSANDRA-5220 > Project: Cassandra > Issue Type: Improvement > Components: Core >Affects Versions: 1.2.0 beta 1 >Reporter: Brandon Williams >Assignee: Yuki Morishita > Labels: performance, repair > Fix For: 2.1.1 > > Attachments: 5220-yourkit.png, 5220-yourkit.tar.bz2 > > > Currently when using vnodes, repair takes much longer to complete than > without them. This appears at least in part because it's using a session per > range and processing them sequentially. This generates a lot of log spam > with vnodes, and while being gentler and lighter on hard disk deployments, > ssd-based deployments would often prefer that repair be as fast as possible. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14018504#comment-14018504 ]

Cyril Scetbon commented on CASSANDRA-5220:

[~SchnickDaddy] It's not fixed yet. We just hope it'll be fixed in version 2.1.1; people are currently digging to find where the overhead that slows the repair is located.

> Repair improvements when using vnodes > - > > Key: CASSANDRA-5220 > URL: https://issues.apache.org/jira/browse/CASSANDRA-5220 > Project: Cassandra > Issue Type: Improvement > Components: Core >Affects Versions: 1.2.0 beta 1 >Reporter: Brandon Williams >Assignee: Yuki Morishita > Labels: performance, repair > Fix For: 2.1.1 > > Attachments: 5220-yourkit.png, 5220-yourkit.tar.bz2 > > > Currently when using vnodes, repair takes much longer to complete than > without them. This appears at least in part because it's using a session per > range and processing them sequentially. This generates a lot of log spam > with vnodes, and while being gentler and lighter on hard disk deployments, > ssd-based deployments would often prefer that repair be as fast as possible. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14018331#comment-14018331 ]

Jeffery Schnick commented on CASSANDRA-5220:

I see this is fixed in 2.1 rc1, but is there a patch, or could I be pointed to the Git commit in which this was addressed? Thanks

> Repair improvements when using vnodes > - > > Key: CASSANDRA-5220 > URL: https://issues.apache.org/jira/browse/CASSANDRA-5220 > Project: Cassandra > Issue Type: Improvement > Components: Core >Affects Versions: 1.2.0 beta 1 >Reporter: Brandon Williams >Assignee: Yuki Morishita > Labels: performance, repair > Fix For: 2.1.1 > > Attachments: 5220-yourkit.png, 5220-yourkit.tar.bz2 > > > Currently when using vnodes, repair takes much longer to complete than > without them. This appears at least in part because it's using a session per > range and processing them sequentially. This generates a lot of log spam > with vnodes, and while being gentler and lighter on hard disk deployments, > ssd-based deployments would often prefer that repair be as fast as possible. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14005843#comment-14005843 ]

Juho Mäkinen commented on CASSANDRA-5220:

In addition, the repair operation gives poor status on its progress, so it would be nice if some additional logging about repair progress were added both to log4j and to JMX.

> Repair improvements when using vnodes > - > > Key: CASSANDRA-5220 > URL: https://issues.apache.org/jira/browse/CASSANDRA-5220 > Project: Cassandra > Issue Type: Improvement > Components: Core >Affects Versions: 1.2.0 beta 1 >Reporter: Brandon Williams >Assignee: Yuki Morishita > Labels: performance, repair > Fix For: 2.1.1 > > Attachments: 5220-yourkit.png, 5220-yourkit.tar.bz2 > > > Currently when using vnodes, repair takes much longer to complete than > without them. This appears at least in part because it's using a session per > range and processing them sequentially. This generates a lot of log spam > with vnodes, and while being gentler and lighter on hard disk deployments, > ssd-based deployments would often prefer that repair be as fast as possible. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976385#comment-13976385 ] Jonathan Ellis commented on CASSANDRA-5220: --- bq. Send validation request once for all ranges, replica node builds MT for each range one by one, and sent back MT as it is built. This is a fairly straightforward extension, isn't it? I'd favor that approach. > Repair improvements when using vnodes > - > > Key: CASSANDRA-5220 > URL: https://issues.apache.org/jira/browse/CASSANDRA-5220 > Project: Cassandra > Issue Type: Improvement > Components: Core >Affects Versions: 1.2.0 beta 1 >Reporter: Brandon Williams >Assignee: Yuki Morishita > Labels: performance, repair > Fix For: 2.1 beta2 > > Attachments: 5220-yourkit.png, 5220-yourkit.tar.bz2 > > > Currently when using vnodes, repair takes much longer to complete than > without them. This appears at least in part because it's using a session per > range and processing them sequentially. This generates a lot of log spam > with vnodes, and while being gentler and lighter on hard disk deployments, > ssd-based deployments would often prefer that repair be as fast as possible. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13971607#comment-13971607 ]

Ryan McGuire commented on CASSANDRA-5220:

yourkit also listed some potential deadlocks, which apparently it doesn't save to the snapshot:

{code}
Frozen threads found (potential deadlock)

It seems that the following threads have not changed their stack for more than 10 seconds.
These threads are possibly (but not necessarily!) in a deadlock or hung.

Thread-10 <--- Frozen for at least 48 sec
sun.nio.ch.FileDispatcherImpl.read0(FileDescriptor, long, int)
sun.nio.ch.SocketDispatcher.read(FileDescriptor, long, int)
sun.nio.ch.IOUtil.readIntoNativeBuffer(FileDescriptor, ByteBuffer, long, NativeDispatcher)
sun.nio.ch.IOUtil.read(FileDescriptor, ByteBuffer, long, NativeDispatcher)
sun.nio.ch.SocketChannelImpl.read(ByteBuffer)
sun.nio.ch.SocketAdaptor$SocketInputStream.read(ByteBuffer)
sun.nio.ch.ChannelInputStream.read(byte[], int, int)
org.xerial.snappy.SnappyInputStream.hasNextChunk()
org.xerial.snappy.SnappyInputStream.read()
java.io.DataInputStream.readInt()
org.apache.cassandra.net.IncomingTcpConnection.handleModernVersion()
org.apache.cassandra.net.IncomingTcpConnection.run()

Thread-11 <--- Frozen for at least 1m 17 sec
sun.nio.ch.FileDispatcherImpl.read0(FileDescriptor, long, int)
sun.nio.ch.SocketDispatcher.read(FileDescriptor, long, int)
sun.nio.ch.IOUtil.readIntoNativeBuffer(FileDescriptor, ByteBuffer, long, NativeDispatcher)
sun.nio.ch.IOUtil.read(FileDescriptor, ByteBuffer, long, NativeDispatcher)
sun.nio.ch.SocketChannelImpl.read(ByteBuffer)
sun.nio.ch.SocketAdaptor$SocketInputStream.read(ByteBuffer)
sun.nio.ch.ChannelInputStream.read(byte[], int, int)
org.xerial.snappy.SnappyInputStream.hasNextChunk()
org.xerial.snappy.SnappyInputStream.read()
java.io.DataInputStream.readInt()
org.apache.cassandra.net.IncomingTcpConnection.handleModernVersion()
org.apache.cassandra.net.IncomingTcpConnection.run()

Thread-12 <--- Frozen for at least 48 sec
sun.nio.ch.FileDispatcherImpl.read0(FileDescriptor, long, int)
sun.nio.ch.SocketDispatcher.read(FileDescriptor, long, int)
sun.nio.ch.IOUtil.readIntoNativeBuffer(FileDescriptor, ByteBuffer, long, NativeDispatcher)
sun.nio.ch.IOUtil.read(FileDescriptor, ByteBuffer, long, NativeDispatcher)
sun.nio.ch.SocketChannelImpl.read(ByteBuffer)
sun.nio.ch.SocketAdaptor$SocketInputStream.read(ByteBuffer)
sun.nio.ch.ChannelInputStream.read(byte[], int, int)
org.xerial.snappy.SnappyInputStream.hasNextChunk()
org.xerial.snappy.SnappyInputStream.read()
java.io.DataInputStream.readInt()
org.apache.cassandra.net.IncomingTcpConnection.handleModernVersion()
org.apache.cassandra.net.IncomingTcpConnection.run()

Thread-13 <--- Frozen for at least 48 sec
sun.nio.ch.FileDispatcherImpl.read0(FileDescriptor, long, int)
sun.nio.ch.SocketDispatcher.read(FileDescriptor, long, int)
sun.nio.ch.IOUtil.readIntoNativeBuffer(FileDescriptor, ByteBuffer, long, NativeDispatcher)
sun.nio.ch.IOUtil.read(FileDescriptor, ByteBuffer, long, NativeDispatcher)
sun.nio.ch.SocketChannelImpl.read(ByteBuffer)
sun.nio.ch.SocketAdaptor$SocketInputStream.read(ByteBuffer)
sun.nio.ch.ChannelInputStream.read(byte[], int, int)
org.xerial.snappy.SnappyInputStream.hasNextChunk()
org.xerial.snappy.SnappyInputStream.read()
java.io.DataInputStream.readInt()
org.apache.cassandra.net.IncomingTcpConnection.handleModernVersion()
org.apache.cassandra.net.IncomingTcpConnection.run()

Thread-3 <--- Frozen for at least 1m 21 sec
sun.nio.ch.FileDispatcherImpl.read0(FileDescriptor, long, int)
sun.nio.ch.SocketDispatcher.read(FileDescriptor, long, int)
sun.nio.ch.IOUtil.readIntoNativeBuffer(FileDescriptor, ByteBuffer, long, NativeDispatcher)
sun.nio.ch.IOUtil.read(FileDescriptor, ByteBuffer, long, NativeDispatcher)
sun.nio.ch.SocketChannelImpl.read(ByteBuffer)
sun.nio.ch.SocketAdaptor$SocketInputStream.read(ByteBuffer)
sun.nio.ch.ChannelInputStream.read(byte[], int, int)
org.xerial.snappy.SnappyInputStream.hasNextChunk()
org.xerial.snappy.SnappyInputStream.read()
java.io.DataInputStream.readInt()
org.apache.cassandra.net.IncomingTcpConnection.handleModernVersion()
org.apache.cassandra.net.IncomingTcpConnection.run()

Thread-7 <--- Frozen for at least 1m 21 sec
sun.nio.ch.FileDispatcherImpl.read0(FileDescriptor, long, int)
sun.nio.ch.SocketDispatcher.read(FileDescriptor, long, int)
sun.nio.ch.IOUtil.readIntoNativeBuffer(FileDescriptor, ByteBuffer, long, NativeDispatcher)
sun.nio.ch.IOUtil.read(FileDescriptor, ByteBuffer, long, NativeDispatcher)
sun.nio.ch.SocketChannelImpl.read(ByteBuffer)
sun.nio.ch.SocketAdaptor$SocketInputStream.read(ByteBuffer)
sun.nio.ch.ChannelInputStream.read(byte[], int, int)
org.xerial.snappy.SnappyInputStream.hasNextChunk()
org.xerial.snappy.SnappyInputStream.read()
java.io.DataInputStream.readInt()
org.apache.cassandra.net.IncomingTcpConnection.handleModernVersion()
org.
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13971595#comment-13971595 ] Yuki Morishita commented on CASSANDRA-5220: --- Thanks, Ryan. The time increase in Incoming/OutboundTcpConnection indicates that repair is spending more time in messaging. It is understandable that messaging takes more than 200x as long when repairing 256x as many ranges. One possible solution is to repair multiple ranges at once. I have two ideas in mind:
# Build a two-level MerkleTree over multiple ranges. The lower level holds the regular per-range MTs, and the upper level is an MT whose leaves are the root hashes of the lower MTs. That way we can carry multiple MTs in one message round trip.
# Send one validation request for all ranges; the replica node builds the MT for each range one by one and sends each MT back as it is built.
> Repair improvements when using vnodes > - > > Key: CASSANDRA-5220 > URL: https://issues.apache.org/jira/browse/CASSANDRA-5220 > Project: Cassandra > Issue Type: Improvement > Components: Core >Affects Versions: 1.2.0 beta 1 >Reporter: Brandon Williams >Assignee: Yuki Morishita > Labels: performance, repair > Fix For: 2.1 beta2 > > Attachments: 5220-yourkit.png, 5220-yourkit.tar.bz2 > > > Currently when using vnodes, repair takes much longer to complete than > without them. This appears at least in part because it's using a session per > range and processing them sequentially. This generates a lot of log spam > with vnodes, and while being gentler and lighter on hard disk deployments, > ssd-based deployments would often prefer that repair be as fast as possible. -- This message was sent by Atlassian JIRA (v6.2#6252)
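Editor's illustration of the first idea in Yuki Morishita's comment above (a two-level Merkle tree). This is only a minimal sketch under stated assumptions, not Cassandra's actual MerkleTree implementation: the names `merkle_root`, `two_level_root`, and `mismatched_ranges`, the `(start, end)` tuple for a token range, and the use of SHA-256 are all hypothetical choices made for the example.

```python
import hashlib
from typing import Dict, List, Tuple

TokenRange = Tuple[int, int]  # hypothetical stand-in for a vnode token range


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves: List[bytes]) -> bytes:
    """Root hash of a binary Merkle tree over already-hashed leaves."""
    if not leaves:
        return _h(b"")
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2:  # duplicate the last node when a level is odd
            level.append(level[-1])
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def two_level_root(rows_by_range: Dict[TokenRange, List[bytes]]):
    """Return (upper_root, {range: lower_root}).

    Lower level: one tree per range. Upper level: a tree whose leaves
    are the lower roots, taken in sorted range order.
    """
    lower = {rng: merkle_root([_h(row) for row in rows])
             for rng, rows in sorted(rows_by_range.items())}
    upper = merkle_root(list(lower.values()))
    return upper, lower


def mismatched_ranges(local, remote) -> List[TokenRange]:
    """Compare two (upper_root, lower_roots) pairs from a single exchange."""
    upper_l, lower_l = local
    upper_r, lower_r = remote
    if upper_l == upper_r:  # all ranges agree, nothing to stream
        return []
    return [rng for rng in lower_l if lower_l[rng] != lower_r.get(rng)]
```

In this sketch, one message carrying the upper root plus the per-range lower roots is enough to tell which ranges differ, which is the round-trip saving the comment describes.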
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13971505#comment-13971505 ] Lyuben Todorov commented on CASSANDRA-5220: --- I'll have a shot at adding some logging to the repair process to see if we can get a better idea of how much time is being spent in the different repair stages.
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13971465#comment-13971465 ] Jonathan Ellis commented on CASSANDRA-5220: --- I just talked to some people who were seeing an 8 node (256 vnodes each) repair with about 1GB/node take *two days*. I would suggest doing some more digging to see where all the overhead is coming from, before guessing at solutions.
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13969613#comment-13969613 ] Richard Low commented on CASSANDRA-5220: It's going to be a lot slower when there's little data because there is num_tokens times as much work to do. But when there is lots of data the times should be pretty much independent of num_tokens because most of repair is spent reading data and hashing. I ran some tests when we were developing vnodes (sorry, I don't have the data still available) and this was the case. Something might have regressed though.
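Editor's illustration of the reasoning in Richard Low's comment above. It is a toy cost model, not a measurement: it assumes repair time is a fixed per-range overhead multiplied by num_tokens plus a read-and-hash term proportional to the data size. The function names and both default constants (`per_range_overhead_s`, `hash_bytes_per_s`) are hypothetical values picked only to show the shape of the curve.

```python
def repair_time(num_tokens: int, data_bytes: float,
                per_range_overhead_s: float, hash_bytes_per_s: float) -> float:
    """Toy model: per-range overhead plus time to read and hash the data."""
    return num_tokens * per_range_overhead_s + data_bytes / hash_bytes_per_s


def vnode_slowdown(data_bytes: float, num_tokens: int = 256,
                   per_range_overhead_s: float = 1.0,   # hypothetical constant
                   hash_bytes_per_s: float = 50e6       # hypothetical constant
                   ) -> float:
    """Ratio of modeled repair time with num_tokens ranges to a single range."""
    with_vnodes = repair_time(num_tokens, data_bytes,
                              per_range_overhead_s, hash_bytes_per_s)
    without = repair_time(1, data_bytes,
                          per_range_overhead_s, hash_bytes_per_s)
    return with_vnodes / without


if __name__ == "__main__":
    for size in (1e6, 1e9, 1e13):
        print(f"{size:.0e} bytes: {vnode_slowdown(size):.2f}x")
```

Under these assumptions the ratio approaches num_tokens when there is little data and approaches 1 when there is a lot. The reports elsewhere in this thread suggest real clusters did not behave this way, which is consistent with the comment's closing remark that something might have regressed.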
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13950268#comment-13950268 ] Brandon Williams commented on CASSANDRA-5220: - After talking with Ryan, I'm convinced that I just didn't have an accurate measure of actual repair time when I filed this, and the problem is even worse than I initially thought :(
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13950239#comment-13950239 ] Brandon Williams commented on CASSANDRA-5220: -
bq. without vnodes: Repair time: 5.10s
That honestly sounds too fast to be believable to me. When I was tracking the time on the dtests, repair was always one of the longest, if not the longest.
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13950222#comment-13950222 ] Jonathan Ellis commented on CASSANDRA-5220: --- Worth bisecting?
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13949909#comment-13949909 ] Ryan McGuire commented on CASSANDRA-5220: - As of today, on cassandra-2.0 HEAD, repair_test.TestRepair.simple_repair_test:
bq. without vnodes: Repair time: 5.10s
bq. with vnodes: Repair time: 562.97s
That is over 100x slower with vnodes than without (562.97 / 5.10 is roughly 110x). So I'm not sure what happened here since @driftx ran this in November.
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13923229#comment-13923229 ] Robert Coli commented on CASSANDRA-5220: {quote}So we're 3-3.5x slower in the simple case.{quote} So, if:
1) the default for gc_grace_seconds is how frequently we want people to repair
2) and vnodes make repair 3-3.5x slower in the simple case
3) and vnodes are enabled by default
4) why has the default for gc_grace_seconds not been increased by 3-3.5x? (CASSANDRA-5850)
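Editor's illustration of the arithmetic behind Robert Coli's question above. Cassandra's default gc_grace_seconds is 864000 (10 days); the sketch only scales that default by the quoted 3x and 3.5x factors. The helper name `scaled_gc_grace` is hypothetical, and the output is not a tuning recommendation.

```python
DEFAULT_GC_GRACE_SECONDS = 864_000  # Cassandra's default, i.e. 10 days
SECONDS_PER_DAY = 86_400


def scaled_gc_grace(factor: float, base: int = DEFAULT_GC_GRACE_SECONDS) -> int:
    """Scale a gc_grace_seconds value by a repair slowdown factor."""
    return int(round(base * factor))


for factor in (3.0, 3.5):
    seconds = scaled_gc_grace(factor)
    print(f"{factor}x -> {seconds} s ({seconds / SECONDS_PER_DAY:.0f} days)")
```

Scaling the default by the slowdown quoted in the comment would therefore mean a window of 30 to 35 days instead of 10.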
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13855774#comment-13855774 ] Donald Smith commented on CASSANDRA-5220: - We ran "nodetool repair" on a 3 node cassandra cluster with production-quality hardware, using version 2.0.3. Each node had about 1TB of data. This is still testing. After 5 days the repair job still hasn't finished. I can see it's still running. Here's the process:
{noformat}
root 30835 30774 0 Dec17 pts/0 00:03:53 /usr/bin/java -cp /etc/cassandra/conf:/usr/share/java/jna.jar:/usr/share/cassandra/lib/antlr-3.2.jar:/usr/share/cassandra/lib/apache-cassandra-2.0.3.jar:/usr/share/cassandra/lib/apache-cassandra-clientutil-2.0.3.jar:/usr/share/cassandra/lib/apache-cassandra-thrift-2.0.3.jar:/usr/share/cassandra/lib/commons-cli-1.1.jar:/usr/share/cassandra/lib/commons-codec-1.2.jar:/usr/share/cassandra/lib/commons-lang3-3.1.jar:/usr/share/cassandra/lib/compress-lzf-0.8.4.jar:/usr/share/cassandra/lib/concurrentlinkedhashmap-lru-1.3.jar:/usr/share/cassandra/lib/disruptor-3.0.1.jar:/usr/share/cassandra/lib/guava-15.0.jar:/usr/share/cassandra/lib/high-scale-lib-1.1.2.jar:/usr/share/cassandra/lib/jackson-core-asl-1.9.2.jar:/usr/share/cassandra/lib/jackson-mapper-asl-1.9.2.jar:/usr/share/cassandra/lib/jamm-0.2.5.jar:/usr/share/cassandra/lib/jbcrypt-0.3m.jar:/usr/share/cassandra/lib/jline-1.0.jar:/usr/share/cassandra/lib/json-simple-1.1.jar:/usr/share/cassandra/lib/libthrift-0.9.1.jar:/usr/share/cassandra/lib/log4j-1.2.16.jar:/usr/share/cassandra/lib/lz4-1.2.0.jar:/usr/share/cassandra/lib/metrics-core-2.2.0.jar:/usr/share/cassandra/lib/netty-3.6.6.Final.jar:/usr/share/cassandra/lib/reporter-config-2.1.0.jar:/usr/share/cassandra/lib/servlet-api-2.5-20081211.jar:/usr/share/cassandra/lib/slf4j-api-1.7.2.jar:/usr/share/cassandra/lib/slf4j-log4j12-1.7.2.jar:/usr/share/cassandra/lib/snakeyaml-1.11.jar:/usr/share/cassandra/lib/snappy-java-1.0.5.jar:/usr/share/cassandra/lib/snaptree-0.1.jar:/usr/share/cassandra/lib/stress.jar:/usr/share/cassandra/lib/thrift-server-0.3.2.jar -Xmx32m -Dlog4j.configuration=log4j-tools.properties -Dstorage-config=/etc/cassandra/conf org.apache.cassandra.tools.NodeCmd -p 7199 repair -pr as_reports
{noformat}
The log output has just:
{noformat}
xss = -ea -javaagent:/usr/share/cassandra/lib/jamm-0.2.5.jar -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms8192M -Xmx8192M -Xmn2048M -XX:+HeapDumpOnOutOfMemoryError -Xss256k
[2013-12-17 23:26:48,144] Starting repair command #1, repairing 256 ranges for keyspace as_reports
{noformat}
Here's the output of "nodetool tpstats":
{noformat}
cass3 /tmp> nodetool tpstats
xss = -ea -javaagent:/usr/share/cassandra/lib/jamm-0.2.5.jar -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms8192M -Xmx8192M -Xmn2048M -XX:+HeapDumpOnOutOfMemoryError -Xss256k
Pool Name                    Active   Pending      Completed   Blocked  All time blocked
ReadStage                         1         0       38083403         0                 0
RequestResponseStage              0         0     1951200451         0                 0
MutationStage                     0         0     2853354069         0                 0
ReadRepairStage                   0         0        3794926         0                 0
ReplicateOnWriteStage             0         0              0         0                 0
GossipStage                       0         0        4880147         0                 0
AntiEntropyStage                  1         3              9         0                 0
MigrationStage                    0         0             30         0                 0
MemoryMeter                       0         0            115         0                 0
MemtablePostFlusher               0         0          75121         0                 0
FlushWriter                       0         0          49934         0                52
MiscStage                         0         0              0         0                 0
PendingRangeCalculator            0         0              7         0                 0
commitlog_archiver                0         0              0         0                 0
AntiEntropySessions               1         1              1         0                 0
InternalResponseStage             0         0              9         0                 0
HintedHandoff                     0         0           1141         0                 0

Message type           Dropped
RANGE_SLICE                  0
READ_REPAIR                  0
PAGED_RANGE                  0
BINARY                       0
READ                       884
MUTATION               1407711
_TRACE                       0
REQUEST_RESPONSE             0
{noformat}
The cluster has some write traffic to it. We decided to test it under load. This is the busiest column family, as reported by "nodetool cfstats": {n
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570751#comment-13570751 ] Yuki Morishita commented on CASSANDRA-5220: --- The reason repair is done almost sequentially is to synchronize Merkle tree creation across the nodes (CASSANDRA-2816). If we could form groups of nodes that do not overlap for several ranges, we would be able to parallelize Merkle tree creation and validation. > Repair improvements when using vnodes > - > > Key: CASSANDRA-5220 > URL: https://issues.apache.org/jira/browse/CASSANDRA-5220 > Project: Cassandra > Issue Type: Improvement > Components: Core >Affects Versions: 1.2.0 beta 1 >Reporter: Brandon Williams >Assignee: Yuki Morishita > Fix For: 1.2.2 > > > Currently when using vnodes, repair takes much longer to complete than > without them. This appears at least in part because it's using a session per > range and processing them sequentially. This generates a lot of log spam > with vnodes, and while being gentler and lighter on hard disk deployments, > ssd-based deployments would often prefer that repair be as fast as possible. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
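Editor's illustration of the grouping idea in Yuki Morishita's comment above: ranges whose replica sets share no node could, in principle, be validated at the same time. This is a minimal greedy sketch under stated assumptions, not how Cassandra schedules repair. The function name `batch_disjoint_ranges`, the `(start, end)` range tuples, and the node names in the usage example are all hypothetical.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

TokenRange = Tuple[int, int]  # hypothetical stand-in for a vnode token range


def batch_disjoint_ranges(
    replicas: Dict[TokenRange, FrozenSet[str]]
) -> List[List[TokenRange]]:
    """Greedily pack ranges into batches so that, within a batch,
    no node is a replica for more than one range."""
    batches: List[Tuple[Set[str], List[TokenRange]]] = []
    for rng in sorted(replicas):
        nodes = replicas[rng]
        for busy, members in batches:
            if busy.isdisjoint(nodes):   # no shared node: safe to run together
                busy.update(nodes)
                members.append(rng)
                break
        else:                            # overlaps every batch: start a new one
            batches.append((set(nodes), [rng]))
    return [members for _, members in batches]


if __name__ == "__main__":
    example = {
        (0, 10): frozenset({"a", "b", "c"}),
        (10, 20): frozenset({"b", "c", "d"}),
        (20, 30): frozenset({"d", "e", "f"}),
        (30, 40): frozenset({"e", "f", "a"}),
    }
    for batch in batch_disjoint_ranges(example):
        print(batch)
```

Each batch could then be processed in parallel while batches run one after another, which keeps any single node from building more than one tree at a time. The greedy packing is not guaranteed to produce the fewest possible batches.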