[ 
https://issues.apache.org/jira/browse/CASSANDRA-15396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17095889#comment-17095889
 ] 

David Capwell commented on CASSANDRA-15396:
-------------------------------------------

||PR||tests||
|[link|https://github.com/apache/cassandra/pull/378]|[link|https://circleci.com/gh/dcapwell/workflows/cassandra/tree/randomAccessReaderSkipBytes]|

In my testing, the following cases improve by relying on seek:

1) compressed data (avoid decompressing when not needed)
2) non-mmap data (avoid the extra io)

The cases which did not improve noticeably (but also did not regress):

1) mmap data which is in the page cache (mmap rebuffer doesn’t copy into 
buffers; instead it switches the position of the buffer. This patch lowers 
the number of times we update the buffer position)

I plan to run stress tests as well, but I am still learning the Data.db read 
path to figure out which access patterns call .skipByte*


> RAR does not override skip or skipBytes so will do a lot of disk io when n is 
> large
> -----------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-15396
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15396
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local/SSTable
>            Reporter: David Capwell
>            Assignee: David Capwell
>            Priority: Normal
>
> RandomAccessReader does not override skip or skipBytes which becomes a 
> problem when the size of n (the bytes to skip) is larger than a single 
> buffer; in these cases we can rely on seek to avoid the extra disk io.
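
To illustrate the description above, here is a minimal, self-contained sketch of the idea. It is not the code from the PR and not Cassandra's RandomAccessReader: SketchReader, skipBytesByReading, rebufferCount and the in-memory byte array standing in for the file are all hypothetical names chosen for this example. It compares a skip that reads through the buffer with a skip that only moves the position via seek and lets the next read reload the buffer.

```java
/** Hypothetical in-memory stand-in for a buffered file reader (illustration only). */
final class SketchReader {
    private final byte[] source;   // stands in for the on-disk file
    private final byte[] buffer;   // fixed-size read buffer
    private long bufferOffset = 0; // file offset of buffer[0]
    private int bufferLength = 0;  // number of valid bytes in the buffer
    private int bufferPos = 0;     // read position inside the buffer
    int rebufferCount = 0;         // how often the "disk" was read

    SketchReader(byte[] source, int bufferSize) {
        this.source = source;
        this.buffer = new byte[bufferSize];
    }

    long position() { return bufferOffset + bufferPos; }

    long length() { return source.length; }

    /** Loads the buffer starting at the given offset; this is the expensive step. */
    private void reBuffer(long offset) {
        bufferOffset = offset;
        bufferLength = (int) Math.min(buffer.length, source.length - offset);
        System.arraycopy(source, (int) offset, buffer, 0, bufferLength);
        bufferPos = 0;
        rebufferCount++;
    }

    /** Reads one unsigned byte, or returns -1 at end of input. */
    int read() {
        if (position() >= length()) return -1;
        if (bufferPos >= bufferLength) reBuffer(position());
        return buffer[bufferPos++] & 0xFF;
    }

    /** Naive skip: consumes n bytes through the buffer, reloading it at every boundary. */
    int skipBytesByReading(int n) {
        int skipped = 0;
        while (skipped < n && read() != -1) skipped++;
        return skipped;
    }

    /** Moves the position without touching the "disk". */
    void seek(long pos) {
        if (pos < 0 || pos > length())
            throw new IllegalArgumentException("position out of range: " + pos);
        if (pos >= bufferOffset && pos <= bufferOffset + bufferLength) {
            bufferPos = (int) (pos - bufferOffset); // still inside the current buffer
        } else {
            bufferOffset = pos; // drop the buffer; the next read() reloads it lazily
            bufferLength = 0;
            bufferPos = 0;
        }
    }

    /** Seek-based skip: clamps n to the remaining bytes and jumps there directly. */
    int skipBytes(int n) {
        if (n <= 0) return 0;
        long current = position();
        int toSkip = (int) Math.min(n, length() - current);
        seek(current + toSkip);
        return toSkip;
    }

    public static void main(String[] args) {
        byte[] data = new byte[1000];
        for (int i = 0; i < data.length; i++) data[i] = (byte) i;

        SketchReader naive = new SketchReader(data, 10);
        naive.read();
        naive.skipBytesByReading(500);
        naive.read();

        SketchReader seeking = new SketchReader(data, 10);
        seeking.read();
        seeking.skipBytes(500);
        seeking.read();

        System.out.println("rebuffers when reading through: " + naive.rebufferCount);
        System.out.println("rebuffers when seeking: " + seeking.rebufferCount);
    }
}
```

With a 10-byte buffer and a 500-byte skip, the read-through variant reloads the buffer at every boundary it crosses, while the seek-based variant reloads it once, on the first read after the skip. When n fits inside the current buffer, both variants behave the same, which matches the observation that the gain shows up only when n is larger than a single buffer.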




