[ https://issues.apache.org/jira/browse/CASSANDRA-5569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rick Branson updated CASSANDRA-5569:
------------------------------------
    Attachment: 5569.txt

> Every stream operation requires checking indexes in every SSTable
> -----------------------------------------------------------------
>
>                 Key: CASSANDRA-5569
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5569
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.2.4
>            Reporter: Rick Branson
>            Assignee: Rick Branson
>             Fix For: 1.2.4
>
>         Attachments: 5569.txt
>
>
> There appears to be a streaming performance issue when leveled compaction
> (LCS) and vnodes are used together. To get the candidate set of chunks to
> stream, the streaming system takes references to every SSTable for a CF.
> This is probably a perfectly reasonable approach for non-vnode cases, because
> the data being streamed is likely distributed across the full SSTable set.
> It is also probably reasonable for size-tiered compaction, because the data
> is, again, likely distributed across the full SSTable set.
> However, for each vnode repair performed on LCS CFs, this scan across
> potentially tens of thousands of SSTables is wasteful, considering that only
> a small percentage of them will actually hold data for a given range.
> This manifested itself as "hanging" repair operations, with tasks backing up
> on the MiscStage thread pool.
> The attached patch changes the streaming code so that, for a given range,
> only SSTables that cover the requested range are checked for inclusion in
> streaming.
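
The idea behind the change described above can be illustrated with a small, self-contained sketch. This is not the attached patch and does not use Cassandra's classes: `Table`, `overlapping`, and the use of plain `long` tokens are hypothetical stand-ins, and the sketch assumes a non-wrapping range that is exclusive on the left and inclusive on the right. It only shows how comparing each table's first and last token against the requested range lets most tables be skipped before any per-table index is consulted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class RangeFilterSketch {
    /** Hypothetical stand-in for an SSTable: only its token bounds are modelled. */
    static final class Table {
        final String name;
        final long first; // smallest token stored in the table
        final long last;  // largest token stored in the table

        Table(String name, long first, long last) {
            this.name = name;
            this.first = first;
            this.last = last;
        }
    }

    /**
     * Returns the tables whose [first, last] span intersects the
     * non-wrapping range (left, right]. Tables outside the range are
     * dropped using in-memory bounds only.
     */
    static List<Table> overlapping(List<Table> all, long left, long right) {
        List<Table> candidates = new ArrayList<>();
        for (Table t : all) {
            // Intersects when the table ends after the exclusive left bound
            // and starts at or before the inclusive right bound.
            if (t.last > left && t.first <= right) {
                candidates.add(t);
            }
        }
        return candidates;
    }

    public static void main(String[] args) {
        // Under LCS, tables within a level hold non-overlapping token spans,
        // so a narrow vnode range touches only a few of them.
        List<Table> level = Arrays.asList(
                new Table("a", 0, 100),
                new Table("b", 101, 200),
                new Table("c", 201, 300));
        for (Table t : overlapping(level, 150, 180)) {
            System.out.println(t.name);
        }
    }
}
```

With tens of thousands of tables, a linear pass over bounds is still proportional to the table count; an interval tree keyed on the same bounds would be a natural refinement, though whether the patch does this is not stated in the issue.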