[ https://issues.apache.org/jira/browse/SOLR-5681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15109834#comment-15109834 ]
Mark Miller commented on SOLR-5681: ----------------------------------- Scratch that, off in the weeds. > Make the OverseerCollectionProcessor multi-threaded > --------------------------------------------------- > > Key: SOLR-5681 > URL: https://issues.apache.org/jira/browse/SOLR-5681 > Project: Solr > Issue Type: Improvement > Components: SolrCloud > Reporter: Anshum Gupta > Assignee: Anshum Gupta > Fix For: 4.9, Trunk > > Attachments: SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, > SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, > SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, > SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681.patch, > SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, > SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, > SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, > SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, > SOLR-5681.patch, SOLR-5681.patch > > > Right now, the OverseerCollectionProcessor is single threaded i.e submitting > anything long running would have it block processing of other mutually > exclusive tasks. > When OCP tasks become optionally async (SOLR-5477), it'd be good to have > truly non-blocking behavior by multi-threading the OCP itself. > For example, a ShardSplit call on Collection1 would block the thread and > thereby, not processing a create collection task (which would stay queued in > zk) though both the tasks are mutually exclusive. > Here are a few of the challenges: > * Mutual exclusivity: Only let mutually exclusive tasks run in parallel. An > easy way to handle that is to only let 1 task per collection run at a time. > * ZK Distributed Queue to feed tasks: The OCP consumes tasks from a queue. > The task from the workQueue is only removed on completion so that in case of > a failure, the new Overseer can re-consume the same task and retry. A queue > is not the right data structure in the first place to look ahead i.e. get the > 2nd task from the queue when the 1st one is in process. Also, deleting tasks > which are not at the head of a queue is not really an 'intuitive' thing. > Proposed solutions for task management: > * Task funnel and peekAfter(): The parent thread is responsible for getting > and passing the request to a new thread (or one from the pool). The parent > method uses a peekAfter(last element) instead of a peek(). The peekAfter > returns the task after the 'last element'. Maintain this request information > and use it for deleting/cleaning up the workQueue. > * Another (almost duplicate) queue: While offering tasks to workQueue, also > offer them to a new queue (call it volatileWorkQueue?). The difference is, as > soon as a task from this is picked up for processing by the thread, it's > removed from the queue. At the end, the cleanup is done from the workQueue. -- This message was sent by Atlassian JIRA (v6.3.4#6332) --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org