[ https://issues.apache.org/jira/browse/CASSANDRA-10757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksey Yeschenko resolved CASSANDRA-10757.
-------------------------------------------
    Resolution: Duplicate

It is indeed a duplicate of CASSANDRA-4756. Closing the ticket as such.

> Cluster migration with sstableloader requires significant compaction time
> -------------------------------------------------------------------------
>
>                 Key: CASSANDRA-10757
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10757
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Compaction, Streaming and Messaging
>            Reporter: Juho Mäkinen
>            Priority: Minor
>              Labels: sstableloader
>
> When sstableloader is used to migrate data from one cluster into another,
> the load creates a lot more data and many more sstable files than the
> original cluster had.
> For example, in my case a 62-node cluster with 16 TiB of data in 80000
> sstables was loaded with sstableloader into another cluster with 36 nodes,
> and this resulted in 42 TiB of used data in a whopping 350000 sstables.
> The sstableloader process itself was relatively fast (around 8 hours), but
> as a result the destination cluster needs approximately two weeks of
> compaction to reduce the number of sstables back to the original levels.
> (The instances are C4.4xlarge in EC2, 16 cores each, with compaction
> running on 14 cores. The EBS disks in each instance provide 9000 IOPS and
> a maximum of 250 MiB/sec of disk bandwidth.)
> Could the sstableloader process somehow be improved to make this kind of
> migration less painful and faster?

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)