[jira] [Commented] (CASSANDRA-11218) Prioritize Secondary Index rebuild

Jeff Jirsa (JIRA) Wed, 02 Nov 2016 22:30:08 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-11218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15631663#comment-15631663
 ]


Jeff Jirsa commented on CASSANDRA-11218:
----------------------------------------

Thanks much for the review [~beobal] 

Pushed 
https://github.com/jeffjirsa/cassandra/commit/96288f42f36ad80bff186707f0f7234dd0a9a8d9
 to address your comments (same branch 
https://github.com/jeffjirsa/cassandra/commits/cassandra-11218-3.X ) . 

CI @ http://cassci.datastax.com/job/jeffjirsa-cassandra-11218-3.X-dtest/ and 
http://cassci.datastax.com/job/jeffjirsa-cassandra-11218-3.X-testall/ (started 
evening of Nov 2, should be ready by Nov 3, depending on when you come back to 
this).  

Some notes:

{quote}
The instanceof checks and special casing to handle regular vs prioritized 
runnable is also duplicated between the newTaskFor methods and the comparator. 
Could we make CompactionExecutor::newTaskFor always return an 
IPrioritizedCompactionFutureTask, so that when the supplied runnable/callable 
is already prioritized it just uses its existing values, and when a 
non-prioritized runnable/callable is received it makes it prioritized with the 
default 0/0/0 values (which is functionally equivalent to what the comparator 
will do anyway). Then the comparator can be further simplified by not having to 
consider non-prioritized tasks.
{quote}

Seems like a good idea to me - implemented.

{quote}
On JIRA you suggested prioritizing anticompaction & index summary 
redistribution joint highest, but in the patch the latter has max priority (and 
is not overridable). Is that intentional?
{quote}

Index summary redistribution runs on its own 
{{org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor}}, so 
the priority is ignored. Added notes to reflect that. 

{quote}
Validation tasks are run a dedicated executor, which doesn't use a priority 
queue. This, and the fact that validation is orthogonal to compaction strategy 
means that all validation tasks are equal priority & also don't block the other 
tasks. So it makes the reasoning for explicity setting validation priority to 
256 a little unclear.
{quote}

Now set to max value with a note that it's on its own executor. 

{quote}
On a related note, CacheCleanupExecutor is also dedicated to a single task 
type. This does end up using a priority queue, but only ever has vanilla 
runnables submitted to it.
{quote}

Now runs with {{Cleanup}} priority (lowest). We don't really have an 
OperationType for {{CacheCleanupExecutor}}, we could add one to be explicit - 
perhaps this should really be same as key/row/counter cache save? I don't have 
a strong opinion on what it SHOULD be. 

{quote}
Some comments...
{quote}

... added. 


Also - I started with the idea that we could do much, much better with 
prioritizing compaction by getting very clever with subtype prioritization 
(compaction within the "normal" compaction priority). Thinking about starvation 
problems, and the 'right' way to generalize across compaction strategies makes 
me wonder if this is a good idea or not - in this patch, I've removed the 
remaining uses of subtype prioritization (where it was previously sorting tasks 
by bytes on disk), in favor of the current behavior (within a given type, tasks 
are ordered by timestamp). I've left the logic in place for later - if we 
eventually decide there's a "right" way to prioritize tasks within a type, we 
can add it back easily later. Some options that may be worth considering there 
are either hotness or a per-CF flag to allow user to prioritize certain CFs 
over others. In any case, I don't think it needs to be perfect - we have a huge 
improvement for operations by prioritizing 2i build and deprioritizing 
cleanup/scrub, so I think trying to get to perfect can wait.

> Prioritize Secondary Index rebuild
> ----------------------------------
>
>                 Key: CASSANDRA-11218
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11218
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Compaction
>            Reporter: sankalp kohli
>            Assignee: Jeff Jirsa
>            Priority: Minor
>
> We have seen that secondary index rebuild get stuck behind other compaction 
> during a bootstrap and other operations. This causes things to not finish. We 
> should prioritize index rebuild via a separate thread pool or using a 
> priority queue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (CASSANDRA-11218) Prioritize Secondary Index rebuild

Reply via email to