[ https://issues.apache.org/jira/browse/CASSANDRA-47?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pavel Yaskevich updated CASSANDRA-47:
-------------------------------------
    Attachment: CASSANDRA-47-v4-fixes.patch

bq. We need to do something for this ticket. Right now, if someone calls the public flush or sync methods of CSW mistakenly, it will corrupt data. So we should at least have the public flush and sync throw an UnsupportedOperationException. I'm not saying it will be particularly clean, but I'll take "slightly ugly and safe" over "cleaner but dangerous" anytime. I'm fine with leaving a "cleaner" refactoring of this to a separate task though.

CSW sync() and flush() both throw UnsupportedOperationException now.

bq. resetAndTruncate is still a problem. ...

Fixed by using information from the metadata file (to avoid keeping information about chunks in memory).

bq. In CSW, when resetBuffer is called, current is supposed to either be "aligned" on chunk boundary or we're closing the file. So it seems there is no need to realign bufferOffset, and thus no need to override resetBuffer.

Fixed.

bq. The truncateAndClose of CSW doesn't seem to truncate anything. It also doesn't honor skipIOCache correctly since it doesn't call the truncateAndClose of SW. But actually, I think that if the only backward seek we do is through resetAndTruncate, then there is no need to truncate on close (neither for SW nor CSW). So we should probably get rid of that function and move the relevant parts in close().

The truncateAndClose method is removed; the relevant parts are moved to close().

bq. Let's use readUTF/writeUTF to read/write the algorithm name in the metadata file. That's what we use for strings usually (and using a StringBuilder to read a string is a tad over the top).

Done.

bq. CompressionMetadata.readChunkOffsets() is buggy if the dataLength is a multiple of the chunkLength (we have one less chunk than what's computed then).

Fixed by storing the chunk count in the index file, so we no longer need to count anything.

bq. No need to reset validBufferBytes in CSW.flushData(). It's done in resetBuffer (and not in SW.flushData(), so that'll improve symmetry).

Fixed.

bq. Chunk could be a static class in CompressionMetadata I suppose.

Done.

> SSTable compression
> -------------------
>
>                 Key: CASSANDRA-47
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-47
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Pavel Yaskevich
>              Labels: compression
>             Fix For: 1.0
>
>         Attachments: CASSANDRA-47-v2.patch, CASSANDRA-47-v3-rebased.patch, CASSANDRA-47-v3.patch, CASSANDRA-47-v4-fixes.patch, CASSANDRA-47-v4.patch, CASSANDRA-47.patch, snappy-java-1.0.3-rc4.jar
>
>
> We should be able to do SSTable compression which would trade CPU for I/O (almost always a good trade).
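
To illustrate the readChunkOffsets() and readUTF/writeUTF points from the comment above, here is a minimal sketch. It is not the code from the patch: the class name, method names, header layout and field order are all hypothetical, and the "naive" count is only one plausible form of the bug described. It shows why an unconditional "+ 1" overcounts when dataLength is an exact multiple of chunkLength, and how storing the chunk count next to a writeUTF-encoded algorithm name lets the reader skip the computation entirely.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical sketch: names and header layout are illustrative assumptions,
// not the actual on-disk format used by the CASSANDRA-47 patch.
class CompressionMetadataSketch {

    // Assumed form of the bug: always adding one chunk overcounts by one
    // when dataLength is an exact multiple of chunkLength.
    static int naiveChunkCount(long dataLength, int chunkLength) {
        return (int) (dataLength / chunkLength) + 1;
    }

    // Ceiling division: a trailing partial chunk adds one, an exact multiple does not.
    static int chunkCount(long dataLength, int chunkLength) {
        return (int) ((dataLength + chunkLength - 1) / chunkLength);
    }

    // Writes the algorithm name with writeUTF and stores the chunk count
    // explicitly, so a reader never has to derive it from the lengths.
    static byte[] writeHeader(String algorithm, int chunkLength, long dataLength, int chunkCount) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(algorithm);
            out.writeInt(chunkLength);
            out.writeLong(dataLength);
            out.writeInt(chunkCount);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the fields back in the same order they were written.
    static String describeHeader(byte[] header) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(header))) {
            String algorithm = in.readUTF();
            int chunkLength = in.readInt();
            long dataLength = in.readLong();
            int chunkCount = in.readInt();
            return algorithm + " chunkLength=" + chunkLength
                    + " dataLength=" + dataLength + " chunks=" + chunkCount;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long dataLength = 131072L; // exactly two 64 KiB chunks
        int chunkLength = 65536;
        System.out.println("naive count:   " + naiveChunkCount(dataLength, chunkLength));
        System.out.println("ceiling count: " + chunkCount(dataLength, chunkLength));
        byte[] header = writeHeader("Snappy", chunkLength, dataLength, chunkCount(dataLength, chunkLength));
        System.out.println(describeHeader(header));
    }
}
```

Storing the count trades a few bytes in the metadata for not having to get the boundary arithmetic right on every read, which appears to be the direction the fix took.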