[ 
https://issues.apache.org/jira/browse/CASSANDRA-19652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17860868#comment-17860868
 ] 

Dmitry Konstantinov commented on CASSANDRA-19652:
-------------------------------------------------

The initial version of the optimization: 
https://github.com/netudima/cassandra/commit/ef66ff9aac5d6613e3cada89edcaa9bec8e46ad8

> ShallowInfoRetriever: cache offsets to avoid resetting of RandomAccessReader 
> buffer
> ----------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-19652
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19652
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local/SSTable
>            Reporter: Dmitry Konstantinov
>            Assignee: Dmitry Konstantinov
>            Priority: Normal
>             Fix For: 5.x
>
>
>  Currently, 
> org.apache.cassandra.io.sstable.format.big.RowIndexEntry.ShallowInfoRetriever#fetchIndex
>  performs two seek/read operations: the first finds the offset of the 
> IndexInfo and the second reads it. These are two quite distant regions of 
> the file, so in standard disk access mode we get no benefit from the buffer 
> in RandomAccessReader: jumping between the regions resets the buffer again 
> and again. A possible improvement is to read and cache the first N offsets 
> (N is bounded to limit memory use) on the first read, so that later reads of 
> IndexInfo data are purely sequential. By caching less than 1 KB we can 
> reduce the number of syscalls even further; in my case, from a few hundred 
> to fewer than 10.
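
The idea in the description can be illustrated with a toy model. The sketch
below is not the Cassandra code and not the linked commit: the class
OffsetCacheSketch, the CountingReader helper, the fixed 64-byte entry size and
the 4 KiB buffer are all assumptions made for illustration. It only counts how
often a single read buffer would have to be refilled (each refill standing in
for a read syscall) when offsets and entries are read alternately, versus when
the first N offsets are cached up front.

```java
import java.nio.ByteBuffer;

// Toy model only: all names and sizes here are hypothetical, not Cassandra APIs.
class OffsetCacheSketch {
    static final int ENTRY_SIZE = 64;    // assumed fixed size; real IndexInfo entries vary
    static final int BUFFER_SIZE = 4096; // assumed reader buffer size

    /** Stand-in for a buffered file reader that counts buffer refills. */
    static final class CountingReader {
        private final byte[] file;
        private long bufferStart = -1;   // file position of the first buffered byte
        long refills = 0;

        CountingReader(byte[] file) { this.file = file; }

        int readInt(long position) {
            boolean miss = bufferStart < 0
                    || position < bufferStart
                    || position + 4 > bufferStart + BUFFER_SIZE;
            if (miss) {                  // a miss resets the buffer at the new position
                bufferStart = position;
                refills++;
            }
            return ByteBuffer.wrap(file, (int) position, 4).getInt();
        }
    }

    /** Layout: entryCount entries first, then a table of their int offsets. */
    static byte[] buildFile(int entryCount) {
        int offsetsStart = entryCount * ENTRY_SIZE;
        ByteBuffer buf = ByteBuffer.allocate(offsetsStart + 4 * entryCount);
        for (int i = 0; i < entryCount; i++) {
            buf.putInt(i * ENTRY_SIZE, 7 * i);              // entry payload
            buf.putInt(offsetsStart + 4 * i, i * ENTRY_SIZE); // offset of entry i
        }
        return buf.array();
    }

    /** Two seeks per entry: offset table, then the entry. Returns {sum, refills}. */
    static long[] readNaive(byte[] file, int entryCount) {
        CountingReader reader = new CountingReader(file);
        long offsetsStart = (long) entryCount * ENTRY_SIZE;
        long sum = 0;
        for (int i = 0; i < entryCount; i++) {
            int offset = reader.readInt(offsetsStart + 4L * i);
            sum += reader.readInt(offset);
        }
        return new long[] { sum, reader.refills };
    }

    /** Caches up to maxCached offsets first, then reads entries. Returns {sum, refills}. */
    static long[] readWithCachedOffsets(byte[] file, int entryCount, int maxCached) {
        CountingReader reader = new CountingReader(file);
        long offsetsStart = (long) entryCount * ENTRY_SIZE;
        int cachedCount = Math.min(entryCount, maxCached);
        int[] cached = new int[cachedCount];
        for (int i = 0; i < cachedCount; i++) {             // one sequential pass
            cached[i] = reader.readInt(offsetsStart + 4L * i);
        }
        long sum = 0;
        for (int i = 0; i < entryCount; i++) {
            // Entries beyond the cache limit fall back to the two-seek path.
            int offset = i < cachedCount ? cached[i] : reader.readInt(offsetsStart + 4L * i);
            sum += reader.readInt(offset);
        }
        return new long[] { sum, reader.refills };
    }

    public static void main(String[] args) {
        byte[] file = buildFile(200);
        long[] naive = readNaive(file, 200);
        long[] cached = readWithCachedOffsets(file, 200, 256); // 256 ints = 1 KiB cap
        System.out.println("naive refills:  " + naive[1]);
        System.out.println("cached refills: " + cached[1]);
    }
}
```

In this model, with 200 entries, the naive loop refills the buffer twice per
entry, while the cached variant needs one refill for the offset table plus a
few for the sequential pass over the entries. The absolute numbers depend
entirely on the assumed sizes; only the shape of the difference is meant to
mirror the description above.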



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
