[ 
https://issues.apache.org/jira/browse/CASSANDRA-2398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stu Hood updated CASSANDRA-2398:
--------------------------------

    Attachment: 0002-CASSANDRA-2398-Type-specific-compression-for-counters.txt
                0001-CASSANDRA-2398-Add-type-specific-compression-to-Abstra.txt

Modified the compression interface to operate on ByteBuffers, and added 
compression support for CounterColumnType. The compression ratios 
(inbytes / outbytes) for the examples in the unit tests are:
{quote}
# with CounterColumnType specific compression
2.745098 for 4 values (inbytes: 140, outbytes: 51)
4.7719297 for 4 values (inbytes: 272, outbytes: 57)
5.9710145 for 8 values (inbytes: 412, outbytes: 69)
5.415465 for 10000 values (inbytes: 350034, outbytes: 64636)

# with generic LZF compression
2.5 for 4 values (inbytes: 140, outbytes: 56)
4.1846156 for 4 values (inbytes: 272, outbytes: 65)
4.2916665 for 8 values (inbytes: 412, outbytes: 96)
2.3148732 for 10000 values (inbytes: 349944, outbytes: 151172)
{quote}
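
For anyone reading the numbers above: each ratio appears to be inbytes divided by outbytes in single-precision arithmetic (140 / 51 is about 2.745098). The sketch below is only an illustration of that arithmetic; the class and method names are hypothetical and are not taken from the attached patches. Note that the two 10000-value runs have slightly different inbytes (350034 vs 349944), so comparing their outbytes directly is approximate.

```java
// Hypothetical helper, not part of the attached patches.
final class CompressionRatio {
    // Ratio as it appears to be reported above: uncompressed bytes / compressed bytes.
    static float ratio(long inBytes, long outBytes) {
        if (outBytes <= 0) {
            throw new IllegalArgumentException("outBytes must be positive");
        }
        return (float) inBytes / outBytes;
    }

    // Fraction of output bytes saved by one encoding relative to another.
    static float savings(long specificOutBytes, long genericOutBytes) {
        if (genericOutBytes <= 0) {
            throw new IllegalArgumentException("genericOutBytes must be positive");
        }
        return 1.0f - (float) specificOutBytes / genericOutBytes;
    }

    public static void main(String[] args) {
        // 4 values, type-specific vs generic LZF (figures copied from the list above).
        System.out.println(ratio(140, 51));
        System.out.println(ratio(140, 56));
        // 10000 values: approximate saving of the type-specific output over LZF.
        System.out.println(savings(64636, 151172));
    }
}
```

On the 10000-value example the type-specific output is a bit under half the size of the LZF output, which is where the gap between the two ratio columns comes from.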

> Type specific compression
> -------------------------
>
>                 Key: CASSANDRA-2398
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2398
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Stu Hood
>              Labels: compression
>             Fix For: 1.0
>
>         Attachments: 
> 0001-CASSANDRA-2398-Add-type-specific-compression-to-Abstra.txt, 
> 0002-CASSANDRA-2398-Type-specific-compression-for-counters.txt
>
>
> Cassandra has many locations that are ripe for type-specific compression. 
> A short list:
>
> Indexes:
>  * Keys compressed as BytesType, which could default to LZO/LZMA
>  * Offsets (delta and varint encoding)
>  * Column names added by 2319
>
> Data:
>  * Keys, columns, timestamps: see http://wiki.apache.org/cassandra/FileFormatDesignDoc
>
> A basic interface for type-specific compression could be as simple as:
> {code:java}
> public void compress(int version, final List<ByteBuffer> from, DataOutput to) throws IOException;
> public void decompress(int version, DataInput from, List<ByteBuffer> to) throws IOException;
> public void skip(int version, DataInput from) throws IOException;
> {code} 
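
To make the proposed interface concrete, here is a minimal sketch of how the "Offsets (delta and varint encoding)" item from the list could sit behind those three methods. It is an illustration only, not the code in the attached patches. The names TypeCompressor, DeltaVarintCompressor and DeltaVarintDemo are hypothetical, the sketch assumes every ByteBuffer holds exactly one 8-byte long, and it ignores the version argument.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical name; the three signatures are the ones from the issue description.
interface TypeCompressor {
    void compress(int version, final List<ByteBuffer> from, DataOutput to) throws IOException;
    void decompress(int version, DataInput from, List<ByteBuffer> to) throws IOException;
    void skip(int version, DataInput from) throws IOException;
}

// Assumption: each ByteBuffer holds exactly one 8-byte long (for example a file offset).
final class DeltaVarintCompressor implements TypeCompressor {
    @Override
    public void compress(int version, final List<ByteBuffer> from, DataOutput to) throws IOException {
        writeVarint(from.size(), to); // value count first, so skip() knows how far to read
        long previous = 0;
        for (ByteBuffer buffer : from) {
            long value = buffer.getLong(buffer.position()); // absolute read, buffer state unchanged
            long delta = value - previous;
            writeVarint((delta << 1) ^ (delta >> 63), to); // zig-zag keeps small negative deltas short
            previous = value;
        }
    }

    @Override
    public void decompress(int version, DataInput from, List<ByteBuffer> to) throws IOException {
        long count = readVarint(from);
        long previous = 0;
        for (long i = 0; i < count; i++) {
            long zigzag = readVarint(from);
            previous += (zigzag >>> 1) ^ -(zigzag & 1); // undo zig-zag, then undo the delta
            ByteBuffer buffer = ByteBuffer.allocate(8);
            buffer.putLong(0, previous);
            to.add(buffer);
        }
    }

    @Override
    public void skip(int version, DataInput from) throws IOException {
        long count = readVarint(from);
        for (long i = 0; i < count; i++) {
            readVarint(from); // varints are self-delimiting, so reading one also skips it
        }
    }

    // Unsigned little-endian base-128 varint: 7 payload bits per byte, high bit means "more".
    private static void writeVarint(long value, DataOutput out) throws IOException {
        while ((value & ~0x7FL) != 0) {
            out.writeByte((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.writeByte((int) value);
    }

    private static long readVarint(DataInput in) throws IOException {
        long result = 0;
        for (int shift = 0; shift < 64; shift += 7) {
            byte b = in.readByte();
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
        }
        throw new IOException("malformed varint");
    }
}

// Hypothetical round-trip helpers for trying the sketch out.
final class DeltaVarintDemo {
    static byte[] encode(long... values) {
        List<ByteBuffer> buffers = new ArrayList<>();
        for (long value : values) {
            ByteBuffer buffer = ByteBuffer.allocate(8);
            buffer.putLong(0, value);
            buffers.add(buffer);
        }
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try {
            new DeltaVarintCompressor().compress(0, buffers, new DataOutputStream(bytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static long[] decode(byte[] encoded) {
        List<ByteBuffer> buffers = new ArrayList<>();
        try {
            DataInput in = new DataInputStream(new ByteArrayInputStream(encoded));
            new DeltaVarintCompressor().decompress(0, in, buffers);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        long[] values = new long[buffers.size()];
        for (int i = 0; i < values.length; i++) {
            values[i] = buffers.get(i).getLong(0);
        }
        return values;
    }
}
```

With this sketch, DeltaVarintDemo.encode(1000, 1100, 1250, 1300) produces 8 bytes (one for the count, then 2 + 2 + 2 + 1 for the zig-zag deltas) instead of the 32 bytes of raw longs, and DeltaVarintDemo.decode returns the original values. Writing the count up front is one possible design choice; it lets skip() walk past a block without materializing any ByteBuffers.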

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
