Cassandra decompress already compressed data

Мириан Джачвадзе Tue, 21 Feb 2017 12:00:47 -0800

This question concerns both the C* driver and the C* server itself, so I'll
ask it here and in the C* driver developers group as well. I hope somebody
can give me an answer, or at least explain why it works the way it does.



This is about Cassandra 2.2.7 (I've also checked C* 3.*, and the result is
the same) and driver 3.0.2.


I am experimenting with blobs in this table:

CREATE TABLE compression_test (
    id uuid,
    chunk int,
    data blob,
    PRIMARY KEY (id, chunk))
WITH compression = { 'sstable_compression' : '' };


The Java Cassandra driver has a compression option out of the box: it can
compress data before sending it to the socket. My assumption was that the
driver compresses the source byte array and sends it to C*, and that after
processing it is stored as is in the SSTable. That is why the table is
configured without compression at the SSTable level. But this was only my
assumption, and during investigation I found out that I was wrong.
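For reference, a minimal sketch of how this option is typically enabled
with the 3.x Java driver. The contact point 127.0.0.1 and the class name
are placeholders for illustration, and the LZ4 option assumes the lz4
library is on the classpath in addition to the driver itself.

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ProtocolOptions;
import com.datastax.driver.core.Session;

class DriverCompressionSketch {
    public static void main(String[] args) {
        // Ask the driver to LZ4-compress frame bodies on the wire.
        // "127.0.0.1" is a placeholder contact point.
        Cluster cluster = Cluster.builder()
                .addContactPoint("127.0.0.1")
                .withCompression(ProtocolOptions.Compression.LZ4)
                .build();
        try {
            Session session = cluster.connect();
            // Print the compression setting the cluster was configured with.
            System.out.println(
                    cluster.getConfiguration().getProtocolOptions().getCompression());
        } finally {
            cluster.close();
        }
    }
}
```

Note that this setting concerns only the connection between driver and
server; the table's own compression option is configured separately in CQL.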


In my case I chose LZ4 compression in the driver, and 0.5 GB of data was
compressed down to 40 MB. Before sending data to the socket, the driver
compresses it, sets the COMPRESSED flag at the beginning of the compressed
array, and sends it to the C* server. In the debugger I can see that exactly
40 MB was written to the socket.


Meanwhile, the C* server has a Server.Initializer class that initializes the
netty ChannelPipeline and appends several ChannelHandlers to it. One of
these ChannelHandlers is Frame.Decompressor. It checks whether a given frame
has the COMPRESSED flag and decompresses the data if it does. While
debugging I also see that C* receives exactly the 40 MB chunk, finds the
COMPRESSED flag, decompresses the chunk and processes it. This leads to
extra memory and disk consumption. I'm not sure how it is stored in the
memtable, but I am absolutely sure that 0.5 GB of decompressed data is
stored on the disk drive, and that is not what I want. I could set the LZ4
compressor for the SSTable so that the received chunk is stored compressed,
but that is also not what I want.
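The flow described above can be illustrated with a small self-contained
sketch. This is not the driver's or the server's actual code: the frame
here is simplified to a single flags byte followed by the body, the JDK's
Deflater/Inflater stand in for LZ4 (which is not part of the JDK), and all
class and method names are hypothetical. It only shows why the receiving
side ends up holding the full uncompressed payload.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

class FrameCompressionSketch {
    // Flag bit marking a compressed body (simplified one-byte header).
    static final byte COMPRESSED = 0x01;

    // Client side: optionally compress the body, then prepend the flags byte.
    static byte[] encode(byte[] body, boolean compress) {
        byte flags = 0;
        byte[] payload = body;
        if (compress) {
            Deflater deflater = new Deflater(); // stand-in for LZ4
            deflater.setInput(body);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            while (!deflater.finished()) {
                out.write(buf, 0, deflater.deflate(buf));
            }
            deflater.end();
            payload = out.toByteArray();
            flags |= COMPRESSED;
        }
        byte[] frame = new byte[payload.length + 1];
        frame[0] = flags;
        System.arraycopy(payload, 0, frame, 1, payload.length);
        return frame;
    }

    // Server side: if the flag is set, inflate the body before processing.
    static byte[] decode(byte[] frame) {
        byte[] payload = Arrays.copyOfRange(frame, 1, frame.length);
        if ((frame[0] & COMPRESSED) == 0) {
            return payload;
        }
        Inflater inflater = new Inflater();
        inflater.setInput(payload);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && inflater.needsInput()) {
                    break; // truncated input: stop instead of looping forever
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt compressed body", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] blob = new byte[1 << 20]; // 1 MiB of zeros, highly compressible
        byte[] frame = encode(blob, true);
        byte[] received = decode(frame);
        System.out.println("on the wire:  " + frame.length + " bytes");
        System.out.println("after decode: " + received.length + " bytes");
    }
}
```

In this sketch the compression is a property of the frame, not of the blob
value inside it: whatever processes the decoded frame sees the original
bytes again, which matches what I observe in the debugger.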


For what purpose does C* decompress data that is already compressed? Am I
missing something in the C* or driver configuration?
