[ https://issues.apache.org/jira/browse/CASSANDRA-14210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kurt Greaves updated CASSANDRA-14210:
-------------------------------------
    Status: Ready to Commit  (was: Patch Available)

> Optimize SSTables upgrade task scheduling
> -----------------------------------------
>
>                 Key: CASSANDRA-14210
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14210
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Compaction
>            Reporter: Oleksandr Shulgin
>            Assignee: Kurt Greaves
>            Priority: Major
>             Fix For: 4.x
>
>
> When starting the SSTable-rewrite process by running {{nodetool
> upgradesstables --jobs N}} with N > 1, not all of the provided N slots are
> used.
> For example, we were testing with {{concurrent_compactors=5}} and {{N=4}}.
> What we observed, for both version 2.2 and 3.0, is that initially all 4
> provided slots are used for "Upgrade sstables" compactions, but later, when
> some of the 4 tasks have finished, no new tasks are scheduled immediately.
> The last of the 4 tasks has to finish before the next 4 tasks are
> scheduled. This happens on every node we've observed.
> This doesn't utilize available resources to the full extent allowed by the
> --jobs N parameter. In the field, on a cluster of 12 nodes with 4-5 TiB of
> data each, we've seen the whole process take more than 7 days instead of
> the estimated 1.5-2 days (assuming close to full utilization of the N
> slots).
> Instead, new tasks should be scheduled as soon as there is a free compaction
> slot.
> Additionally, starting from the biggest SSTables could further reduce the
> total time required for the whole process to finish on any given node.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
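
The scheduling difference described in the issue can be illustrated with a small toy model. This is only a sketch, not Cassandra's compaction code: the function names (batch_makespan, greedy_makespan) are hypothetical, the task durations are made-up numbers, and it assumes each upgrade task takes time proportional to its SSTable size.

```python
import heapq


def batch_makespan(sizes, jobs):
    """Total time when each batch of `jobs` tasks must fully finish
    before the next batch starts (the behaviour reported in the issue)."""
    total = 0
    for i in range(0, len(sizes), jobs):
        # A batch lasts as long as its slowest task.
        total += max(sizes[i:i + jobs])
    return total


def greedy_makespan(sizes, jobs):
    """Total time when a new task starts as soon as any slot is free
    (the behaviour the issue asks for)."""
    # Min-heap holding the time at which each slot becomes free.
    slots = [0] * jobs
    heapq.heapify(slots)
    for size in sizes:
        free_at = heapq.heappop(slots)  # earliest available slot
        heapq.heappush(slots, free_at + size)
    return max(slots)


# Hypothetical task durations (arbitrary units), with N = 4 slots.
sizes = [8, 1, 1, 1, 8, 1, 1, 1]
jobs = 4

batch = batch_makespan(sizes, jobs)
greedy = greedy_makespan(sizes, jobs)
# Same greedy policy, but starting from the biggest SSTables.
greedy_biggest_first = greedy_makespan(sorted(sizes, reverse=True), jobs)

print(batch, greedy, greedy_biggest_first)
```

With these made-up durations, the batch policy needs 16 time units, filling free slots immediately needs 9, and additionally starting from the largest tasks needs 8. The exact numbers depend entirely on the assumed sizes; the point is only that idle slots in the batch policy inflate the total time.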