[ 
https://issues.apache.org/jira/browse/CASSANDRA-270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-270:
-------------------------------------

    Attachment: 270.txt

The root of the problem is that writing a ColumnFamily [row data] object has 
so far been a two-step process:

 1. serialize the row to a DataOutput
 2. write the DataOutput's length, followed by the DataOutput's content, to the 
sstable file

Thus, there is an extra copy to the intermediate DataOutput before getting to 
the file.
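
As a rough illustration of the two-step write described above (not the actual Cassandra code), the sketch below assumes the intermediate DataOutput is an in-memory buffer. The class name BufferedRowWriter, the method writeRow, and the "int length, then bytes" column layout are all hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class BufferedRowWriter {
    /**
     * Hypothetical sketch of the old write path: the row is serialized
     * into an in-memory buffer first, then length + content go to the file.
     */
    public static void writeRow(DataOutputStream sstable, byte[][] columnValues)
            throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        DataOutputStream rowOut = new DataOutputStream(buffer);

        // Step 1: serialize the row to the intermediate buffer (first copy).
        for (byte[] value : columnValues) {
            rowOut.writeInt(value.length);
            rowOut.write(value);
        }
        rowOut.flush();

        // Step 2: write the buffer's length, then its content (second copy).
        sstable.writeInt(buffer.size());
        buffer.writeTo(sstable);
    }
}
```

Every column value is copied once into the buffer and again into the file stream; that intermediate copy is the one under discussion.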

This patch takes a different approach than Todd's: instead of using a clever 
DataOutput to avoid copying the byte[] inside Column objects, we change the 
algorithm to

 1. write a placeholder length value
 2. serialize the row directly to the sstable
 3. seek back and write the correct length now that we know it
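
The three steps above could be sketched roughly as follows. This is a minimal illustration using plain java.io.RandomAccessFile, not the code in 270.txt. The class LengthPrefixedWriter and the RowSerializer interface are hypothetical names, and the 4-byte int length is an assumption. Note that a plain RandomAccessFile is unbuffered, so this shows only the algorithm, not the buffering discussed below.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

public class LengthPrefixedWriter {
    /** Hypothetical stand-in for a row serializer; not Cassandra's API. */
    public interface RowSerializer {
        void serialize(RandomAccessFile out) throws IOException;
    }

    /** Writes [int length][row bytes] and returns the row length. */
    public static int writeRow(RandomAccessFile file, RowSerializer row)
            throws IOException {
        long lengthPos = file.getFilePointer();
        file.writeInt(0);                 // 1. placeholder length

        long start = file.getFilePointer();
        row.serialize(file);              // 2. serialize directly to the file
        long end = file.getFilePointer();

        int length = (int) (end - start);
        file.seek(lengthPos);             // 3. seek back, write the real length
        file.writeInt(length);

        file.seek(end);                   // resume appending after the row
        return length;
    }
}
```

The row bytes go to the file once; only the 4-byte length is written twice.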

If we seek within our BufferedRandomAccessFile buffer, we don't actually 
generate seek system calls, so there's essentially no penalty for doing this. 
We can guarantee that the seek stays within the buffer by setting the buffer 
size to the InMemoryCompactionLimit (past which we already do two passes, as 
explained in CASSANDRA-16).

(This did require a small modification to BRAF, which used to flush on every 
backward seek; that is unnecessarily pessimistic.)
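
To illustrate why an in-buffer seek can be free, here is a simplified sketch of a write buffer that only touches the underlying file when the seek target falls outside the buffered range. It is not the real BufferedRandomAccessFile or the patch itself. The class InBufferSeekWriter and all of its members are hypothetical.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

/** Hypothetical simplified write buffer; not the actual BRAF. */
public class InBufferSeekWriter {
    private final RandomAccessFile file;
    private final byte[] buf;
    private long bufStart;  // file offset corresponding to buf[0]
    private int pos;        // current position within buf
    private int limit;      // number of valid bytes in buf
    private int flushes;    // counts real writes to the underlying file

    public InBufferSeekWriter(RandomAccessFile file, int bufferSize)
            throws IOException {
        this.file = file;
        this.buf = new byte[bufferSize];
        this.bufStart = file.getFilePointer();
    }

    public long position() { return bufStart + pos; }

    public int flushCount() { return flushes; }

    public void write(byte[] data) throws IOException {
        for (byte b : data) {
            if (pos == buf.length) flush();
            buf[pos++] = b;
            if (pos > limit) limit = pos;
        }
    }

    public void seek(long target) throws IOException {
        if (target >= bufStart && target <= bufStart + limit) {
            // Target is inside the buffered range: just move the cursor.
            pos = (int) (target - bufStart);
        } else {
            // Target is outside the buffer: flush, then reposition.
            flush();
            bufStart = target;
        }
    }

    public void flush() throws IOException {
        long current = bufStart + pos;
        if (limit > 0) {
            file.seek(bufStart);
            file.write(buf, 0, limit);
            flushes++;
        }
        bufStart = current;  // keep the logical position unchanged
        pos = 0;
        limit = 0;
    }
}
```

With a buffer at least as large as the row, writing a placeholder, writing the row, seeking back to patch the length, and seeking forward again all happen in memory, and the file sees a single write at flush time.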

> Reduce copies in data write path
> --------------------------------
>
>                 Key: CASSANDRA-270
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-270
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Todd Lipcon
>             Fix For: 0.7
>
>         Attachments: 270.txt, patches.tar
>
>
> This is a series of patches against a very old version of Cassandra - they 
> certainly won't apply, but Jonathan asked me to upload the patches here to do 
> the ASF grant.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
