[ https://issues.apache.org/jira/browse/CASSANDRA-1610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13032796#comment-13032796 ]
Stu Hood commented on CASSANDRA-1610:
-------------------------------------

* Have the AbstractCompactionStrategy class return the default strategy for use in CFMetaData
* createCompactionStrategyInstance should use FBUtilities.construct(class, readable)
* Unnecessary method renames in CFMetaData
* CompactionStrategy instantiation in DatabaseDescriptor duplicates the instantiation in CFMetaData: see what could be put into FBUtilities
* Whitespace changes in db.ColumnFamily
* Unnecessary ByteBufferUtil import in ColumnIndexer
* Are you sure we can remove the major compaction file size threshold?
* Need to special case the 'expired' directory in SSTable.tryComponentFromFilename
* handleInsufficientSpaceForCompaction should move inside getBuckets (as mentioned in your TODOs): it would be best if the strategy logged at info/warn for files that don't fit into a bucket that matches the parameters
* re: the TODO in doExpireCompaction: For correctness' sake, we'll need to invalidate row cache entries that match the expired files, but I would be fine doing that in a separate ticket, because it'll be a little bit involved
* Try to remove TODOs that are speculative: if there are tasks that are blockers for this ticket, list them here. If they aren't blockers for this ticket, but are worthy tasks, they should be moved into tickets before this is committed
* Please parse the options for TimestampBucketedCompactionStrategy in the constructor
* One or two comments explaining the bucketing strategy for TimestampBucketed.getBuckets would be helpful
* Methods that are public only for testing should be package protected (cf. getBuckets)
* Seconds would make a better unit for expiration than days
* See if you can find a way to remove some of the duplication between selectFor(Minor|Major)
* The AbstractCompactedRow sstableStats reference should move into SSTableWriter... to collect information about a row as it is appended to the writer, you'll probably want to pass it to AbstractCompactedRow.write(file, <stats>). There is an example approach on 2319
* useOldStatsFile should be descriptive
* Rename SSTableStats to SSTableMetadata
* Unnecessary imports in ByteBufferUtil
* Regarding the disabled test in DatabaseDescriptorTest: it's probably because Maps of CharSequence will not equal one another if one contains UTF8s and the other contains Strings: see if we have another round trip test, and then consider removing that one. It's not the first time it's come up

Awesome work Alan!

> Pluggable Compaction
> --------------------
>
>         Key: CASSANDRA-1610
>         URL: https://issues.apache.org/jira/browse/CASSANDRA-1610
>     Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>    Reporter: Chris Goffinet
>    Assignee: Alan Liang
>    Priority: Minor
>      Labels: compaction
>     Fix For: 1.0
>
> Attachments: 0001-move-compaction-code-into-own-package.patch, 0002-Pluggable-Compaction-and-Expiration.patch
>
>
> In CASSANDRA-1608, I proposed some changes on how compaction works. I think
> it also makes sense to allow the ability to have pluggable compaction per CF.
> There could be many types of workloads where this makes sense. One example we
> had at Digg was to completely throw away certain SSTables after N days.
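
For reference, the CharSequence map problem raised in the DatabaseDescriptorTest item of the comment can be reproduced with the JDK alone. The sketch below is only an illustration, not code from the patch: java.nio.CharBuffer stands in for a non-String CharSequence such as the UTF8 type mentioned in the comment, and the class name, the `normalize` helper, and the `expiration_seconds` option are hypothetical.

```java
import java.nio.CharBuffer;
import java.util.HashMap;
import java.util.Map;

public class CharSequenceMapEquality {
    // Hypothetical option map whose key and value are plain Strings.
    static Map<CharSequence, CharSequence> withStrings() {
        Map<CharSequence, CharSequence> m = new HashMap<>();
        m.put("expiration_seconds", "86400");
        return m;
    }

    // Same characters, but held in a different CharSequence implementation.
    // CharBuffer is an assumption standing in for a UTF8-style type.
    static Map<CharSequence, CharSequence> withBuffers() {
        Map<CharSequence, CharSequence> m = new HashMap<>();
        m.put(CharBuffer.wrap("expiration_seconds"), CharBuffer.wrap("86400"));
        return m;
    }

    // Copy a map, converting every key and value to String so that
    // equality depends only on the character content.
    static Map<String, String> normalize(Map<? extends CharSequence, ? extends CharSequence> in) {
        Map<String, String> out = new HashMap<>();
        for (Map.Entry<? extends CharSequence, ? extends CharSequence> e : in.entrySet()) {
            out.put(e.getKey().toString(), e.getValue().toString());
        }
        return out;
    }

    public static void main(String[] args) {
        Map<CharSequence, CharSequence> strings = withStrings();
        Map<CharSequence, CharSequence> buffers = withBuffers();

        // Map.equals relies on the equals() of its keys and values, and a
        // String never equals a CharBuffer, so the raw maps differ.
        System.out.println(strings.equals(buffers)); // false

        // After normalizing both sides to String, the maps compare equal.
        System.out.println(normalize(strings).equals(normalize(buffers))); // true
    }
}
```

A round trip test that normalizes both sides this way (or compares entry by entry via toString()) would not depend on which CharSequence implementation deserialization happens to produce.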