[ 
https://issues.apache.org/jira/browse/CASSANDRA-1117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12872334#action_12872334
 ] 

Stu Hood commented on CASSANDRA-1117:
-------------------------------------

I got to thinking about Jonathan's 2-level binary search idea, and realized 
that a multiple level binary search would be handled really well by a tree.

The tree I'm imagining would be a tree of depth K+2 where K is the number of 
index/data files (2 in our current situation). The 0th level would be a root. 
At each of the K levels after the root, you would have inner nodes representing 
the segments of the index/data file at that level. The 1st level would contain 
the segments for the smallest file, the 2nd level would contain the segments 
for the second smallest, and the Kth would contain the segments for the data 
file. The K+1th level would contain leaf nodes which would be equivalent to the 
contents of the IndexSummary class.
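
As a rough illustration of the lookup such a tree implies (a sketch only, not a
patch and not the existing IndexSummary or SegmentedFile code), the class below
builds several levels of progressively coarser samples over a sorted key array
and runs one bounded binary search per level. The class name MultiLevelIndex,
the use of long keys, and the in-memory arrays standing in for index/data file
segments are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical sketch of a multi-level binary search.
 * In-memory arrays stand in for the index/data file segments;
 * nothing here mirrors Cassandra's actual classes.
 */
public class MultiLevelIndex {
    // levels.get(0) is the coarsest summary; the last level holds every key.
    private final List<long[]> levels = new ArrayList<>();
    private final int interval;

    public MultiLevelIndex(long[] sortedKeys, int interval, int levelCount) {
        this.interval = interval;
        long[] current = sortedKeys;
        levels.add(current);
        for (int i = 1; i < levelCount; i++) {
            // Sample every interval-th entry of the finer level.
            long[] coarser = new long[(current.length + interval - 1) / interval];
            for (int j = 0; j < coarser.length; j++) {
                coarser[j] = current[j * interval];
            }
            levels.add(0, coarser);
            current = coarser;
        }
    }

    /** Returns the position of key in the finest level, or -1 if absent. */
    public int find(long key) {
        int lo = 0;
        int hi = levels.get(0).length;
        for (int level = 0; level < levels.size(); level++) {
            long[] keys = levels.get(level);
            int floor = floorIndex(keys, lo, Math.min(hi, keys.length), key);
            if (floor < 0) {
                return -1; // key sorts before every indexed key
            }
            if (level == levels.size() - 1) {
                return keys[floor] == key ? floor : -1;
            }
            // The floor entry covers exactly `interval` entries one level down.
            lo = floor * interval;
            hi = lo + interval;
        }
        return -1;
    }

    // Greatest index i in [lo, hi) with keys[i] <= key, or -1 if none.
    private static int floorIndex(long[] keys, int lo, int hi, long key) {
        int result = -1;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (keys[mid] <= key) {
                result = mid;
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return result;
    }
}
```

Only the coarsest level is searched in full; every later level is searched
within a window of at most `interval` entries, which is the property that would
let the finer levels live on disk.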

I thiiink I can implement this structure over the weekend if it sounds 
worthwhile?

Also, generalizing to multiple levels of indexing means that at some point in 
the future, we could write out multiple index files at progressively higher 
resolution, giving you a balanced tree on disk. Our INDEX_INTERVAL is intended 
to represent the ratio between RAM and disk, so theoretically you should always 
have enough memory to summarize the index in memory, but in most cases a lot 
of that memory would be better spent on row cache.

> Clean up MMAP support
> ---------------------
>
>                 Key: CASSANDRA-1117
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1117
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Stu Hood
>            Assignee: Gary Dusbabek
>             Fix For: 0.7
>
>         Attachments: 0001-Use-factory-functions-for-RowIndexedReader.patch, 
> 0002-Add-SegmentedFile-to-abstract-opening-FileDataInputs.patch, 
> 0003-Replace-mmap-file-abstraction-with-SegmentedFile.patch, 
> 0004-Rename-SSTableReaderTest-to-SegmentedFileTest.patch, 
> 0005-Remove-filename-munging.patch
>
>
> Awareness of MMAP is currently embedded into the SSTableReader implementation 
> and IndexSummary. A good number of bugs experienced recently have been due to 
> this lack of separation, so it is ripe for abstraction. Additionally, the 
> current implementation does not provide a good method for iterating over the 
> segments of a file, which is useful for range queries, and lays more stable 
> groundwork for #998.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
