[ https://issues.apache.org/jira/browse/CASSANDRA-1041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12863620#action_12863620 ]
Schubert Zhang commented on CASSANDRA-1041:
-------------------------------------------

Another reason I added this size threshold is a limitation of the current implementation: "Cassandra's compaction code currently deserializes an entire row (per columnfamily) at a time." With a size threshold, we can temporarily keep rows from growing too large.

In fact, we are also trying to fix that limitation itself. Our idea is to iterate over columns instead of rows.

> Skip large size (Configurable) SSTable in minor or/and major compaction
> -----------------------------------------------------------------------
>
>                 Key: CASSANDRA-1041
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1041
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Schubert Zhang
>            Priority: Minor
>         Attachments: CASSANDRA-1041-0.6.1.patch, CASSANDRA-1041-0.6.patch
>
>
> When the SSTable files are large enough, such as 100GB, the cost of compaction
> (both minor and major) is high (disk IO, CPU, memory, etc.).
> In some applications, we accept not compacting all SSTables into the final very
> large ones.
> This feature provides two optional configurable attributes,
> MinorCompactSkipInGB and MajorCompactSkipInGB, for each ColumnFamily.
> The optional MinorCompactSkipInGB attribute specifies the maximum size of
> SSTables that will be compacted in a minor compaction. SSTables larger
> than MinorCompactSkipInGB will be skipped. The optional MajorCompactSkipInGB
> attribute does the same for major compaction.
> The default for both attributes is 0, which means nothing is skipped, as in the
> current 0.6.1.
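
The skip rule described in the issue can be sketched roughly as follows. This is only an illustration under assumptions, not the code in the attached patches: the class name `CompactionSizeFilter`, the method `eligible`, and the use of 1024^3 bytes per GB are all hypothetical. Only the threshold semantics (0 means "do not skip", larger SSTables are skipped) come from the description above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not taken from the CASSANDRA-1041 patches.
final class CompactionSizeFilter {
    // Assumption: one GB is counted as 1024^3 bytes.
    static final long BYTES_PER_GB = 1024L * 1024L * 1024L;

    /**
     * Returns the SSTable sizes (in bytes) that stay eligible for compaction.
     * skipInGB plays the role of MinorCompactSkipInGB or MajorCompactSkipInGB;
     * a value of 0 disables the filter, as the issue describes for the default.
     */
    static List<Long> eligible(List<Long> sstableSizesInBytes, long skipInGB) {
        if (skipInGB <= 0) {
            // Default behaviour: nothing is skipped.
            return new ArrayList<>(sstableSizesInBytes);
        }
        long limitInBytes = skipInGB * BYTES_PER_GB;
        List<Long> kept = new ArrayList<>();
        for (long size : sstableSizesInBytes) {
            // SSTables larger than the threshold are left out of the compaction.
            if (size <= limitInBytes) {
                kept.add(size);
            }
        }
        return kept;
    }
}
```

With a threshold of 100, for example, an SSTable of exactly 100GB would still be compacted in this sketch, while anything larger would be left alone. Whether the boundary is inclusive in the actual patch is not stated in the issue.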