[ 
https://issues.apache.org/jira/browse/CASSANDRA-47?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067860#comment-13067860
 ] 

Sylvain Lebresne commented on CASSANDRA-47:
-------------------------------------------

{quote}
The thing about Input/Output classes was mentioned previously at 
CASSANDRA-1470. I -1 doing "seek() method throw an exception if the CDF has 
been opened in "rw" mode" because this is not a clean interface but I rather 
prefer to make separate classes as that will be a more reasonable and clean 
design. Anyway, even right now common ancestor of both is RandomAccessFile (or 
even FileDataInput). So I -1 doing merge of CDF and BRAF before we have a BRAF 
refactored.
{quote}

I don't understand that argument. BRAF and CDF do the same thing; they only 
differ in that CDF has a decompression/compression step while moving data 
in/out of the buffer and a slight translation to decide which part of the 
file to buffer. The rest of the code is exactly the same: all the buffer 
manipulation, when to sync, when to rebuffer, etc. And it's not the simplest 
code ever, so not a place where code duplication sounds like a good idea.
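
To make the point concrete, here is a rough, read-only sketch of the shape 
this could take. It is not the code from the attached patches: every class 
and method name is hypothetical, java.util.zip stands in for Snappy so the 
example is self-contained, and a byte array stands in for the file. The 
buffer handling lives in one place, and only the "load a chunk" step differs.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// All names are hypothetical; this is not the BRAF/CDF code from the patches.
abstract class ChunkedReader {
    protected final byte[] file;      // stands in for the on-disk file
    protected final int chunkLength;  // uncompressed bytes per buffer fill
    private byte[] buffer = new byte[0];
    private long bufferOffset = 0;    // uncompressed offset of buffer[0]
    private long position = 0;        // uncompressed read position

    ChunkedReader(byte[] file, int chunkLength) {
        this.file = file;
        this.chunkLength = chunkLength;
    }

    // The one step that differs between the two readers.
    protected abstract byte[] loadChunk(int chunkIndex);

    void seek(long newPosition) { position = newPosition; }

    // Shared logic: decide when to rebuffer, then serve bytes from the buffer.
    int read() {
        if (position < bufferOffset || position >= bufferOffset + buffer.length) {
            int chunkIndex = (int) (position / chunkLength);
            buffer = loadChunk(chunkIndex);
            bufferOffset = (long) chunkIndex * chunkLength;
            if (position >= bufferOffset + buffer.length) return -1; // past the end
        }
        return buffer[(int) (position++ - bufferOffset)] & 0xFF;
    }
}

class PlainReader extends ChunkedReader {
    PlainReader(byte[] file, int chunkLength) { super(file, chunkLength); }

    @Override
    protected byte[] loadChunk(int chunkIndex) {
        int from = Math.min(chunkIndex * chunkLength, file.length);
        int to = Math.min(from + chunkLength, file.length);
        return Arrays.copyOfRange(file, from, to);
    }
}

class CompressedReader extends ChunkedReader {
    private final int[] chunkOffsets; // file offset of each chunk, plus the end offset

    CompressedReader(byte[] file, int chunkLength, int[] chunkOffsets) {
        super(file, chunkLength);
        this.chunkOffsets = chunkOffsets;
    }

    @Override
    protected byte[] loadChunk(int chunkIndex) {
        if (chunkIndex >= chunkOffsets.length - 1) return new byte[0];
        int start = chunkOffsets[chunkIndex]; // offset translation
        Inflater inflater = new Inflater();
        inflater.setInput(file, start, chunkOffsets[chunkIndex + 1] - start);
        byte[] out = new byte[chunkLength];
        try {
            int n = 0;
            while (!inflater.finished() && n < out.length) {
                int r = inflater.inflate(out, n, out.length - n);
                if (r == 0) break; // no progress: stop rather than spin
                n += r;
            }
            return Arrays.copyOf(out, n);
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt chunk " + chunkIndex, e);
        } finally {
            inflater.end();
        }
    }

    // Helper for the example: compress chunk by chunk and wrap the result.
    static CompressedReader compress(byte[] data, int chunkLength) {
        int chunks = (data.length + chunkLength - 1) / chunkLength;
        int[] offsets = new int[chunks + 1];
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] scratch = new byte[chunkLength + 64];
        for (int i = 0; i < chunks; i++) {
            offsets[i] = out.size();
            int from = i * chunkLength;
            Deflater deflater = new Deflater();
            deflater.setInput(data, from, Math.min(chunkLength, data.length - from));
            deflater.finish();
            while (!deflater.finished()) out.write(scratch, 0, deflater.deflate(scratch));
            deflater.end();
        }
        offsets[chunks] = out.size();
        return new CompressedReader(out.toByteArray(), chunkLength, offsets);
    }
}
```

The write path (when to sync, flushing dirty buffers) is left out of the 
sketch, but it would be shared in the same way.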

{quote}
 I'm a bit conserved about adding one more file to handle a single SSTable, 
main goal of my design here was to make CDF independent from other components 
of the system to avoid any additional complexity
{quote}

I don't see why adding a new component adds any complexity. I actually find it 
rather cleaner, as that component would likely nicely correspond to an 
in-memory object holding all the metadata related to compression.
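
As an illustration only, the in-memory object could look something like the 
sketch below. The class name, the fields, and the on-disk layout are all 
assumptions made for the example, not a proposed format.

```java
import java.nio.ByteBuffer;

// Hypothetical in-memory view of a separate compression-metadata component.
final class CompressionInfo {
    final int chunkLength;     // uncompressed bytes per chunk
    final long dataLength;     // total uncompressed length of the data file
    final long[] chunkOffsets; // offset of each compressed chunk in the data file

    CompressionInfo(int chunkLength, long dataLength, long[] chunkOffsets) {
        this.chunkLength = chunkLength;
        this.dataLength = dataLength;
        this.chunkOffsets = chunkOffsets;
    }

    // Serialize to the bytes that would be written as the extra component.
    byte[] serialize() {
        ByteBuffer buf = ByteBuffer.allocate(4 + 8 + 4 + 8 * chunkOffsets.length);
        buf.putInt(chunkLength).putLong(dataLength).putInt(chunkOffsets.length);
        for (long offset : chunkOffsets) buf.putLong(offset);
        return buf.array();
    }

    // Rebuild the object from the component's bytes when the SSTable is opened.
    static CompressionInfo deserialize(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        int chunkLength = buf.getInt();
        long dataLength = buf.getLong();
        long[] offsets = new long[buf.getInt()];
        for (int i = 0; i < offsets.length; i++) offsets[i] = buf.getLong();
        return new CompressionInfo(chunkLength, dataLength, offsets);
    }

    // Translate an uncompressed position into the file offset of its chunk.
    long chunkOffsetFor(long position) {
        if (position < 0 || position >= dataLength)
            throw new IllegalArgumentException("position out of range: " + position);
        return chunkOffsets[(int) (position / chunkLength)];
    }
}
```

With this, the reader only has to ask the object where a chunk starts, and 
the data file itself carries no index section.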

{quote}
maybe it's better to stream file offsets to the temporary file while SSTable 
being written and after that store index section at the end of the file
{quote}

If what you mean is what I understand, that sounds way more complicated than 
having a separate component.

{quote}
We can use a magic number the same way as gzip does 
http://en.wikipedia.org/wiki/Gzip#File_format.
{quote}

That wouldn't be more reliable than the control bytes.

> SSTable compression
> -------------------
>
>                 Key: CASSANDRA-47
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-47
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Pavel Yaskevich
>              Labels: compression
>             Fix For: 1.0
>
>         Attachments: CASSANDRA-47-v2.patch, CASSANDRA-47.patch, 
> snappy-java-1.0.3-rc4.jar
>
>
> We should be able to do SSTable compression which would trade CPU for I/O 
> (almost always a good trade).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
