[ https://issues.apache.org/jira/browse/CASSANDRA-7994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153607#comment-14153607 ]
Jonathan Ellis commented on CASSANDRA-7994:
-------------------------------------------

Marking duplicate to link the two. Leaving both open until we decide which approach to go with.

> Commit logs on the fly compression
> -----------------------------------
>
>                 Key: CASSANDRA-7994
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7994
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Oleg Anastasyev
>         Attachments: CompressedCommitLogs-7994.txt
>
>
> This patch employs the LZ4 algorithm to compress commit logs. This could be
> useful to conserve disk space when archiving commit logs for a long time, or
> to conserve IOPS for use cases with frequent, large mutations updating the
> same record.
> The compression is performed on blocks of 64k, for better cross-mutation
> compression. A CRC is computed on each 64k block, unlike the original code,
> which computes it on each individual mutation.
> On one of our real production clusters this saved 2/3 of the space consumed by
> commit logs. The replay is 20-30% slower for the same number of mutations.
> While doing this, I also refactored the commit log reading code into a
> CommitLogReader class, which I believe makes the code cleaner.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
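The block-level scheme in the issue description (64k blocks, one CRC per block rather than one per mutation) can be illustrated with a minimal sketch. This is not the code from the attached patch. It assumes a hypothetical frame layout (compressed length, then CRC32, then payload), uses zlib from the Python standard library as a stand-in for LZ4, computes the CRC over the uncompressed block, and lets mutations span block boundaries. The names compress_log and replay_log are invented for the example.

```python
import struct
import zlib

BLOCK_SIZE = 64 * 1024         # 64k blocks, as in the issue description
HEADER = struct.Struct(">II")  # hypothetical frame header: compressed length, CRC32


def compress_log(mutations):
    """Pack serialized mutations into a stream and compress it block by block."""
    # Length-prefix each mutation so replay can split the stream again.
    stream = b"".join(struct.pack(">I", len(m)) + m for m in mutations)
    out = bytearray()
    for start in range(0, len(stream), BLOCK_SIZE):
        block = stream[start:start + BLOCK_SIZE]
        packed = zlib.compress(block)  # stand-in for LZ4 (assumption)
        # One CRC per block, computed over the uncompressed bytes (assumption).
        out += HEADER.pack(len(packed), zlib.crc32(block)) + packed
    return bytes(out)


def replay_log(data):
    """Decompress every block, verify its CRC, and return the mutations."""
    stream = bytearray()
    pos = 0
    while pos < len(data):
        size, crc = HEADER.unpack_from(data, pos)
        pos += HEADER.size
        block = zlib.decompress(data[pos:pos + size])
        pos += size
        if zlib.crc32(block) != crc:
            raise ValueError("CRC mismatch in block")
        stream += block
    # Split the reassembled stream back into individual mutations.
    mutations = []
    pos = 0
    while pos < len(stream):
        (length,) = struct.unpack_from(">I", stream, pos)
        pos += 4
        mutations.append(bytes(stream[pos:pos + length]))
        pos += length
    return mutations
```

Because each block covers many mutations, repeated updates to the same record share one compression window, which is the cross-mutation benefit the description refers to. The trade-off is that a CRC failure invalidates a whole block rather than a single mutation.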