[ https://issues.apache.org/jira/browse/CASSANDRA-1117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12871330#action_12871330 ]
Stu Hood commented on CASSANDRA-1117:
-------------------------------------

The MMAP segmenting support in SSTableReader was buggy because the implementation in trunk is extremely complicated. RowIndexedReader.getPosition, for instance, is 110 lines long. The MMAP logic for finding segment boundaries is implemented twice in trunk: once for the index, and once for the data file. Additionally, the segment support is not abstracted enough to support using MMAP'd reads for calls to getNearestPosition, which always fall back to buffered files.

This patchset addresses all of these issues, and sets us up to be able to use MMAP'd reads for Scanners (which currently always use buffered files). Also, the set adds support for iterating over the segments in a file, which getNearestPosition punts on entirely, and which getPosition special-cases.

Finally, because the segmenting is extracted from RowIndexedReader, it will be much easier to reuse for the new API in 998 and the new file format for 674. Eventually.

While this patchset isn't a huge win for performance (my quick tests showed a 5% drop), I think it is a huge win for clarity and reusability.

> Clean up MMAP support
> ---------------------
>
>                 Key: CASSANDRA-1117
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1117
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Stu Hood
>            Assignee: Gary Dusbabek
>             Fix For: 0.7
>
>         Attachments: 0001-Use-factory-functions-for-RowIndexedReader.patch,
> 0002-Add-SegmentedFile-to-abstract-opening-FileDataInputs.patch,
> 0003-Replace-mmap-file-abstraction-with-SegmentedFile.patch,
> 0004-Rename-SSTableReaderTest-to-SegmentedFileTest.patch,
> 0005-Remove-filename-munging.patch
>
>
> Awareness of MMAP is currently embedded into the SSTableReader implementation
> and IndexSummary. A good number of bugs experienced recently have been due to
> this lack of separation, so it is ripe for abstraction.
> Additionally, the current implementation does not provide a good method for
> iterating over the segments of a file, which is useful for range queries, and
> lays more stable groundwork for #998.
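
To illustrate the kind of segment abstraction discussed above, here is a minimal sketch in Java. It is not the SegmentedFile class from the attached patches: the class name SegmentSketch and all of its methods are hypothetical. It assumes fixed-size segments (a single Java MappedByteBuffer cannot exceed Integer.MAX_VALUE bytes, which is why a large file needs more than one mapping), and it only tracks offsets rather than mapping anything. A real implementation would probably also need to place boundaries so that no row straddles two segments; that is ignored here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of segment bookkeeping; not the patchset's SegmentedFile. */
public class SegmentSketch {
    /** Start offset of each segment, ascending; the first entry is always 0. */
    private final long[] offsets;
    /** Total length of the (imaginary) file in bytes. */
    private final long length;

    public SegmentSketch(long length, long maxSegmentSize) {
        if (length <= 0 || maxSegmentSize <= 0)
            throw new IllegalArgumentException("length and maxSegmentSize must be positive");
        // Ceiling division: number of fixed-size segments needed to cover the file.
        int count = (int) ((length + maxSegmentSize - 1) / maxSegmentSize);
        this.offsets = new long[count];
        for (int i = 0; i < count; i++)
            offsets[i] = i * maxSegmentSize;
        this.length = length;
    }

    public int segmentCount() {
        return offsets.length;
    }

    /** Index of the segment whose range [start, end) contains the given position. */
    public int segmentFor(long position) {
        if (position < 0 || position >= length)
            throw new IllegalArgumentException("position out of range: " + position);
        int idx = Arrays.binarySearch(offsets, position);
        // On a miss, binarySearch returns -(insertionPoint) - 1; the containing
        // segment is the one just before the insertion point.
        return idx >= 0 ? idx : -idx - 2;
    }

    public long segmentStart(int i) {
        return offsets[i];
    }

    /** Exclusive end offset of segment i. */
    public long segmentEnd(int i) {
        return i + 1 < offsets.length ? offsets[i + 1] : length;
    }

    /** All segments in file order as {start, end} pairs, for sequential scans. */
    public List<long[]> segments() {
        List<long[]> result = new ArrayList<>();
        for (int i = 0; i < offsets.length; i++)
            result.add(new long[] { segmentStart(i), segmentEnd(i) });
        return result;
    }
}
```

With one structure answering both "which segment holds this position" and "what are the segments, in order", the index and the data file could share the same lookup code, and a range scan could walk segments() instead of falling back to a buffered reader.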