[ https://issues.apache.org/jira/browse/SOLR-8744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Noble Paul updated SOLR-8744:
-----------------------------
    Priority: Blocker  (was: Major)

> Overseer operations need more fine grained mutual exclusion
> ------------------------------------------------------------
>
>                 Key: SOLR-8744
>                 URL: https://issues.apache.org/jira/browse/SOLR-8744
>             Project: Solr
>          Issue Type: Improvement
>          Components: SolrCloud
>    Affects Versions: 5.4.1
>            Reporter: Scott Blum
>            Assignee: Noble Paul
>            Priority: Blocker
>              Labels: sharding, solrcloud
>             Fix For: 6.1
>
>         Attachments: SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, SmileyLockTree.java, SmileyLockTree.java
>
>
> SplitShard creates a mutex over the whole collection, but, in practice, this is a big scaling problem. Multiple split-shard operations could happen at the same time, as long as different shards are being split. In practice, those shards often reside on different machines, so there is no I/O bottleneck in those cases, just the mutex in Overseer forcing the operations to be done serially.
> Given that a single split can take many minutes on a large collection, this is a bottleneck at scale.
> Here is the proposed new design.
> There are various collection operations performed at the Overseer. They may need exclusive access at various levels. Each operation must define the access level at which exclusion is required. The access level is an enum:
> CLUSTER(0)
> COLLECTION(1)
> SHARD(2)
> REPLICA(3)
> The Overseer node maintains a tree of these locks. The lock tree would look as follows. The tree can be created lazily as and when tasks come up.
> {code}
> Legend:
> C1, C2 -> Collections
> S1, S2 -> Shards
> R1, R2, R3, R4 -> Replicas
>
>                  Cluster
>                 /       \
>                /         \
>              C1           C2
>             /  \         /  \
>            /    \       /    \
>          S1      S2   S1      S2
>       R1, R2  R3, R4  R1, R2  R3, R4
> {code}
> When the Overseer receives a message, it tries to acquire the appropriate lock from the tree. For example, if an operation needs a lock at the collection level and it needs to operate on collection C1, the node C1 and all child nodes of C1 must be free.
> h2. Lock acquiring logic
> Each operation starts from the root of the tree (level 0 -> Cluster) and moves down depending on the operation. After it reaches the right node, it checks whether all the children are free of locks. If it fails to acquire a lock, the task remains in the work queue. A scheduler thread waits for notification from the current set of tasks: every task does a {{notify()}} on the monitor of the scheduler thread. The thread then starts from the head of the queue and checks each task to see whether it can acquire the right lock. If yes, the task is executed; if not, it is left in the work queue.
> When a new task arrives in the work queue, the scheduler thread wakes up and just tries to schedule that task.
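
To make the acquisition walk concrete, here is a minimal, hypothetical Java sketch of the lock tree described above, assuming the path depth encodes the access level (collection -> shard -> replica). All names (LockTreeSketch, tryAcquire, release) are invented for illustration and are not taken from SmileyLockTree.java or the SOLR-8744 patches; the ancestor check is an added assumption so that a CLUSTER- or COLLECTION-level lock also excludes operations below it, which the description implies but does not spell out.

{code}
import java.util.HashMap;
import java.util.Map;

public class LockTreeSketch {

  // Mirrors the access levels in the design; in this simplified sketch the
  // level is implied by the depth of the path passed to tryAcquire().
  public enum AccessLevel { CLUSTER, COLLECTION, SHARD, REPLICA }

  static class Node {
    final String name;
    final Node parent;
    final Map<String, Node> children = new HashMap<>();
    boolean locked = false;

    Node(String name, Node parent) { this.name = name; this.parent = parent; }

    // Children are created lazily, as the design suggests.
    Node child(String childName) {
      return children.computeIfAbsent(childName, n -> new Node(n, this));
    }

    // A node is free if neither it nor any descendant holds a lock.
    boolean subtreeFree() {
      if (locked) return false;
      for (Node c : children.values()) if (!c.subtreeFree()) return false;
      return true;
    }

    // Assumption: a locked ancestor also blocks this node, otherwise a
    // CLUSTER-level lock would not exclude collection/shard operations.
    boolean ancestorsFree() {
      for (Node p = parent; p != null; p = p.parent) if (p.locked) return false;
      return true;
    }
  }

  private final Node root = new Node("cluster", null);

  // Walk from the root (CLUSTER) down to the node named by the path,
  // e.g. ["C1"] for a collection lock or ["C1", "S1"] for a shard lock,
  // and lock that node only if its subtree and ancestors are all free.
  public synchronized boolean tryAcquire(String... path) {
    Node node = root;
    for (String part : path) node = node.child(part);
    if (!node.subtreeFree() || !node.ancestorsFree()) return false;
    node.locked = true;
    return true;
  }

  public synchronized void release(String... path) {
    Node node = root;
    for (String part : path) node = node.child(part);
    node.locked = false;
  }

  public static void main(String[] args) {
    LockTreeSketch tree = new LockTreeSketch();
    System.out.println(tree.tryAcquire("C1", "S1")); // true: split C1/S1
    System.out.println(tree.tryAcquire("C1", "S2")); // true: split C1/S2 in parallel
    System.out.println(tree.tryAcquire("C1"));       // false: collection-level op blocked
    tree.release("C1", "S1");
    tree.release("C1", "S2");
    System.out.println(tree.tryAcquire("C1"));       // true once both shard locks are released
  }
}
{code}

In this sketch, the two shard-level splits on C1 proceed concurrently while a collection-level operation on C1 stays queued until both shard locks are released, which is the behaviour the proposal is after. A failed tryAcquire() corresponds to the task staying in the work queue for the scheduler thread to retry.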