[ https://issues.apache.org/jira/browse/CASSANDRA-14210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16378235#comment-16378235 ]
Oleksandr Shulgin edited comment on CASSANDRA-14210 at 2/27/18 8:35 AM:
------------------------------------------------------------------------

We are observing a very similar problem with ordinary compaction. I am not sure whether the proposed change could cover both cases (with the difference that for compaction you likely want to start with the smallest tables first, though this is up to the actual compaction strategy).

A node runs with {{concurrent_compactors=2}} and is doing a rather big compaction (> 200 GB) of one table. At the same time, a lot of small files are streamed in by repair for a different table. The number of {{*-Data.db}} files for that other table grows as high as 5,500, and the estimated number of pending compaction tasks for this node jumps to over 180. But no compaction is started for the table with the many small data files until the currently running compaction task finishes. Why is that? I would expect a free compaction slot to be picked up immediately by new tasks.

> Optimize SSTables upgrade task scheduling
> -----------------------------------------
>
>         Key: CASSANDRA-14210
>         URL: https://issues.apache.org/jira/browse/CASSANDRA-14210
>     Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>    Reporter: Oleksandr Shulgin
>    Assignee: Kurt Greaves
>    Priority: Major
>     Fix For: 4.x
>
> When starting the SSTable-rewrite process by running {{nodetool upgradesstables --jobs N}} with N > 1, not all of the provided N slots are used.
>
> For example, we were testing with {{concurrent_compactors=5}} and {{N=4}}. What we observed for both version 2.2 and 3.0 is that initially all 4 provided slots are used for "Upgrade sstables" compactions, but later, when some of the 4 tasks have finished, no new tasks are scheduled immediately. Only after the last of the 4 tasks finishes are 4 new tasks scheduled. This happens on every node we've observed.
>
> This doesn't utilize the available resources to the full extent allowed by the --jobs N parameter. In the field, on a cluster of 12 nodes with 4-5 TiB of data each, we've seen the whole process take more than 7 days instead of the estimated 1.5-2 days (assuming close to full utilization of the N slots).
>
> Instead, new tasks should be scheduled as soon as there is a free compaction slot.
>
> Additionally, starting from the biggest SSTables could further reduce the total time required for the whole process to finish on any given node.
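
To illustrate the scheduling difference described above, here is a minimal, self-contained Java sketch. It is not Cassandra's actual CompactionManager or upgradesstables code; the {{UpgradeTask}} class and the SSTable names are hypothetical stand-ins. It shows slot-style scheduling with a fixed pool of N workers, where a finished task immediately frees its slot for the next pending task, in contrast to the batch behaviour reported in the ticket, where a new group of N tasks only starts after the whole previous batch has completed.

{code:java}
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class SlotScheduling
{
    // Hypothetical stand-in for a single "Upgrade sstables" compaction task.
    static final class UpgradeTask implements Callable<String>
    {
        private final String sstable;

        UpgradeTask(String sstable)
        {
            this.sstable = sstable;
        }

        @Override
        public String call() throws Exception
        {
            Thread.sleep((long) (Math.random() * 500)); // simulate variable task duration
            return sstable;
        }
    }

    public static void main(String[] args) throws Exception
    {
        List<String> sstables = Arrays.asList("sst-1", "sst-2", "sst-3", "sst-4",
                                              "sst-5", "sst-6", "sst-7", "sst-8");
        int jobs = 4; // analogous to --jobs N

        // Slot-style scheduling: submit every task up front to a pool of N workers.
        // As soon as one task finishes, the freed worker immediately picks up the
        // next pending task, so a long-running task never blocks the other slots.
        ExecutorService pool = Executors.newFixedThreadPool(jobs);
        CompletionService<String> done = new ExecutorCompletionService<>(pool);

        for (String sstable : sstables)
            done.submit(new UpgradeTask(sstable));

        // Consume results in completion order; throughput is limited only by the
        // number of workers, not by the slowest task in a batch.
        for (int i = 0; i < sstables.size(); i++)
            System.out.println("finished upgrade of " + done.take().get());

        pool.shutdown();
    }
}
{code}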