[ https://issues.apache.org/jira/browse/CASSANDRA-5569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13658440#comment-13658440 ]
Rick Branson commented on CASSANDRA-5569:
-----------------------------------------

http://github.com/apache/cassandra/blob/cassandra-1.2/src/java/org/apache/cassandra/streaming/StreamingRepairTask.java#L133

^^^ is there a good reason we're calling StreamOut.transferSSTables directly instead of StreamOut.transferRanges?

> Every stream operation requires checking indexes in every SSTable
> -----------------------------------------------------------------
>
>                 Key: CASSANDRA-5569
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5569
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.2.0
>            Reporter: Rick Branson
>            Assignee: Rick Branson
>            Priority: Minor
>              Labels: streaming
>             Fix For: 1.2.5
>
>         Attachments: 5569.txt, 5569-v2.txt
>
>
> It looks like there's a streaming performance issue when leveled compaction
> and vnodes get together. To get the candidate set of chunks to stream, the
> streaming system gets references to every SSTable for a CF. This is probably
> a perfectly reasonable assumption for non-vnode cases, because the data being
> streamed is likely distributed across the full SSTable set. This is also
> probably a perfectly reasonable assumption for size-tiered compaction,
> because the data is, again, likely distributed across the full SSTable set.
> However, for each vnode repair performed on LCS CF's, this scan across
> potentially tens of thousands of SSTables is wasteful considering that only a
> small percentage of them will actually have data for a given range.
> This manifested itself as "hanging" repair operations with tasks backing up
> on the MiscStage thread pool.
> The attached patch changes the streaming code so that for a given range, only
> SSTables for the requested range are checked to be included in streaming.
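
For readers following the issue description, here is a minimal sketch of the idea it describes: pick streaming candidates by token-range overlap instead of referencing every SSTable. This is not the attached patch and does not use Cassandra's classes. SSTableInfo, candidatesFor, and the use of plain long tokens are hypothetical stand-ins, and the range (left, right] is assumed not to wrap around the ring.

```java
import java.util.ArrayList;
import java.util.List;

public class RangeFilterSketch {

    /** Hypothetical stand-in for an SSTable: a name plus its first and last token. */
    static final class SSTableInfo {
        final String name;
        final long firstToken;
        final long lastToken;

        SSTableInfo(String name, long firstToken, long lastToken) {
            this.name = name;
            this.firstToken = firstToken;
            this.lastToken = lastToken;
        }
    }

    /**
     * Returns only the SSTables whose token span [firstToken, lastToken]
     * overlaps the requested range (left, right].
     * Assumption: left < right, i.e. the range does not wrap around.
     */
    static List<SSTableInfo> candidatesFor(List<SSTableInfo> all, long left, long right) {
        List<SSTableInfo> out = new ArrayList<>();
        for (SSTableInfo s : all) {
            // left is exclusive, right is inclusive
            if (s.lastToken > left && s.firstToken <= right) {
                out.add(s);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<SSTableInfo> all = new ArrayList<>();
        all.add(new SSTableInfo("a", 0, 100));
        all.add(new SSTableInfo("b", 101, 200));
        all.add(new SSTableInfo("c", 201, 300));

        // Only SSTables overlapping (120, 180] are considered for streaming.
        for (SSTableInfo s : candidatesFor(all, 120, 180)) {
            System.out.println(s.name);
        }
    }
}
```

The linear scan above is only for illustration. With leveled compaction, SSTables within a level cover non-overlapping spans, so an interval-based lookup could avoid touching most of them; whether the patch does this is not stated in this message.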