[jira] [Commented] (SOLR-8744) Overseer operations need more fine grained mutual exclusion

Scott Blum (JIRA) Fri, 10 Jun 2016 10:30:48 -0700

    [ 
https://issues.apache.org/jira/browse/SOLR-8744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15324891#comment-15324891
 ]


Scott Blum commented on SOLR-8744:
----------------------------------

LG.  I only have one suggestion left, to formulate the "fetch" section like 
this:

{code}
          ArrayList<QueueEvent> heads = new ArrayList<>(blockedTasks.size() + 
MAX_PARALLEL_TASKS);
          heads.addAll(blockedTasks.values());

          // If we have enough items in the blocked tasks already, it makes
          // no sense to read more items from the work queue. it makes sense
          // to clear out at least a few items in the queue before we read more 
items
          if (heads.size() < MAX_BLOCKED_TASKS) {
            //instead of reading MAX_PARALLEL_TASKS items always, we should 
only fetch as much as we can execute
            int toFetch = Math.min(MAX_BLOCKED_TASKS - heads.size(), 
MAX_PARALLEL_TASKS - runningTasks.size());
            List<QueueEvent> newTasks = workQueue.peekTopN(toFetch, 
excludedTasks, 2000L);
            log.debug("Got {} tasks from work-queue : [{}]", newTasks.size(), 
newTasks.toString());
            heads.addAll(newTasks);
          } else {
            Thread.sleep(1000);
          }

          if (isClosed) break;

          if (heads.isEmpty()) {
            continue;
          }

          blockedTasks.clear(); // clear it now; may get refilled below.
{code}

This prevents two problems:

1) The log.debug message "Got {} tasks from work-queue" won't keep reporting 
blockedTasks as if they were freshly fetched.
2) When the blockedTask map gets completely full, the Thread.sleep() will 
prevent a free-spin (the only other real pause in the loop is peekTopN)

> Overseer operations need more fine grained mutual exclusion
> -----------------------------------------------------------
>
>                 Key: SOLR-8744
>                 URL: https://issues.apache.org/jira/browse/SOLR-8744
>             Project: Solr
>          Issue Type: Improvement
>          Components: SolrCloud
>    Affects Versions: 5.4.1
>            Reporter: Scott Blum
>            Assignee: Noble Paul
>            Priority: Blocker
>              Labels: sharding, solrcloud
>             Fix For: 6.1
>
>         Attachments: SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, 
> SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, 
> SOLR-8744.patch, SmileyLockTree.java, SmileyLockTree.java
>
>
> SplitShard creates a mutex over the whole collection, but, in practice, this 
> is a big scaling problem.  Multiple split shard operations could happen at 
> the time time, as long as different shards are being split.  In practice, 
> those shards often reside on different machines, so there's no I/O bottleneck 
> in those cases, just the mutex in Overseer forcing the operations to be done 
> serially.
> Given that a single split can take many minutes on a large collection, this 
> is a bottleneck at scale.
> Here is the proposed new design
> There are various Collection operations performed at Overseer. They may need 
> exclusive access at various levels. Each operation must define the Access 
> level at which the access is required. Access level is an enum. 
> CLUSTER(0)
> COLLECTION(1)
> SHARD(2)
> REPLICA(3)
> The Overseer node maintains a tree of these locks. The lock tree would look 
> as follows. The tree can be created lazily as and when tasks come up.
> {code}
> Legend: 
> C1, C2 -> Collections
> S1, S2 -> Shards 
> R1,R2,R3,R4 -> Replicas
>                  Cluster
>                 /       \
>                /         \         
>               C1          C2
>              / \         /   \     
>             /   \       /     \      
>            S1   S2      S1     S2
>         R1, R2  R3.R4  R1,R2   R3,R4
> {code}
> When the overseer receives a message, it tries to acquire the appropriate 
> lock from the tree. For example, if an operation needs a lock at a Collection 
> level and it needs to operate on Collection C1, the node C1 and all child 
> nodes of C1 must be free. 
> h2.Lock acquiring logic
> Each operation would start from the root of the tree (Level 0 -> Cluster) and 
> start moving down depending upon the operation. After it reaches the right 
> node, it checks if all the children are free from a lock.  If it fails to 
> acquire a lock, it remains in the work queue. A scheduler thread waits for 
> notification from the current set of tasks . Every task would do a 
> {{notify()}} on the monitor of  the scheduler thread. The thread would start 
> from the head of the queue and check all tasks to see if that task is able to 
> acquire the right lock. If yes, it is executed, if not, the task is left in 
> the work queue.  
> When a new task arrives in the work queue, the schedulerthread wakes and just 
> try to schedule that task.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[jira] [Commented] (SOLR-8744) Overseer operations need more fine grained mutual exclusion

Reply via email to