[ https://issues.apache.org/jira/browse/SOLR-8744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15298756#comment-15298756 ]
David Smiley commented on SOLR-8744:
------------------------------------

bq. (Noble) David, It works. But it is not yet complete. But, I miss the point. What are we trying to solve? Is the current implementation buggy? We just need one correct implementation.

I am merely offering an alternative implementation to part of the task that I feel is simpler and more elegant (as I stated). In part I'm doing this because it's fun/interesting :-) Other than its complexity and the related code already in OverseerTaskProcessor, I'm not sure what bugs may or may not be in your patch or the existing code.

bq. (Scott) Have you run your impl against the test Noble Paul wrote? Curious if it passes.

No; I'd like to do that. I suggest [~noble.paul] commit to a branch and push, and I'll commit on top (assuming tests pass). Of course it will be reverted if you all don't like it.

bq. (Scott) LayeredLock.tryLock() doesn't really implement markBusy() because it doesn't retain readLocks on the parent chain if you fail to lock the child. To implement markBusy you'd need to leave them locked (and have a way to later completely reset all the read locks).

Thanks for clarifying on IRC why the Overseer would want to use tryLock() instead of lock() -- something that confused me from your question/statement. I _had thought_ OverseerTaskProcessor was going to spawn off a task/thread (more likely an Executor impl) that would simply call lock() in its own thread, waiting as long as needed to acquire the lock before doing its task. I am quite ready to admit I don't know what's actually going on here ;-P which is apparently the case. Let me ask you this, then... why *doesn't* it work in this manner? It sounds simple enough. I do see an executor (the tpe field), but there is so much surrounding code that it's hard for me to grok why tryLock vs. lock might be wanted. In such a design (in my simple conceptual model), fairness=true should be set on the ReentrantReadWriteLocks in SmileyLockTree to prevent starvation and get FIFO behavior on the write lock. If it's too complicated to explain, maybe let me have at it on a branch. And again, I may very well not appreciate (yet) why OverseerTaskProcessor with its proposed LockTree needs to be so complicated, so please don't take any of my questions or suggestions as a slight.
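To make that conceptual model concrete, here is a minimal sketch of the blocking-lock() approach -- illustrative names only, not the attached SmileyLockTree.java. Each tree node holds a fair ReentrantReadWriteLock; an operation read-locks every ancestor top-down and then write-locks its target node, blocking as long as needed.

{code}
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Sketch only: one node per Cluster/Collection/Shard/Replica in the lock tree.
class TreeLockNode {
  private final TreeLockNode parent; // null for the Cluster root
  private final ReentrantReadWriteLock rwLock =
      new ReentrantReadWriteLock(true); // fairness=true: FIFO hand-off, no writer starvation

  TreeLockNode(TreeLockNode parent) { this.parent = parent; }

  /** Blocks until this node is held exclusively; every ancestor is held shared. */
  void lock() {
    lockAncestorsShared(parent);
    rwLock.writeLock().lock();
  }

  void unlock() {
    rwLock.writeLock().unlock();
    for (TreeLockNode n = parent; n != null; n = n.parent) {
      n.rwLock.readLock().unlock();
    }
  }

  // Read-lock from the root downward so every thread takes locks in the same
  // order, which rules out deadlock between operations at different levels.
  private static void lockAncestorsShared(TreeLockNode node) {
    if (node == null) return;
    lockAncestorsShared(node.parent);
    node.rwLock.readLock().lock();
  }
}
{code}

With this, a COLLECTION-level operation on C1 calls {{lock()}} on the C1 node: any later operation under C1 blocks trying to read-lock C1, and a CLUSTER-level operation blocks trying to write-lock the root, because C1's operation holds a read lock there.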
> Overseer operations need more fine grained mutual exclusion
> -----------------------------------------------------------
>
>                 Key: SOLR-8744
>                 URL: https://issues.apache.org/jira/browse/SOLR-8744
>             Project: Solr
>          Issue Type: Improvement
>          Components: SolrCloud
>    Affects Versions: 5.4.1
>            Reporter: Scott Blum
>            Assignee: Noble Paul
>              Labels: sharding, solrcloud
>         Attachments: SOLR-8744.patch, SOLR-8744.patch, SmileyLockTree.java, SmileyLockTree.java
>
> SplitShard creates a mutex over the whole collection, but in practice this is a big scaling problem. Multiple split-shard operations could happen at the same time, as long as different shards are being split. In practice, those shards often reside on different machines, so there's no I/O bottleneck in those cases, just the mutex in Overseer forcing the operations to be done serially.
> Given that a single split can take many minutes on a large collection, this is a bottleneck at scale.
> Here is the proposed new design.
> There are various collection operations performed at the Overseer. They may need exclusive access at various levels. Each operation must define the access level at which it requires access. Access level is an enum:
> CLUSTER(0)
> COLLECTION(1)
> SHARD(2)
> REPLICA(3)
> The Overseer node maintains a tree of these locks. The lock tree would look as follows; the tree can be created lazily as and when tasks come up.
> {code}
> Legend:
> C1, C2 -> Collections
> S1, S2 -> Shards
> R1, R2, R3, R4 -> Replicas
>
>              Cluster
>             /       \
>            /         \
>          C1           C2
>         /  \         /  \
>        /    \       /    \
>      S1      S2   S1      S2
>    R1,R2  R3,R4  R1,R2  R3,R4
> {code}
> When the Overseer receives a message, it tries to acquire the appropriate lock from the tree. For example, if an operation needs a lock at the Collection level and it needs to operate on Collection C1, the node C1 and all child nodes of C1 must be free.
> h2. Lock acquiring logic
> Each operation starts from the root of the tree (Level 0 -> Cluster) and moves down depending upon the operation. After it reaches the right node, it checks whether all the children are free of locks. If it fails to acquire a lock, it remains in the work queue. A scheduler thread waits for notification from the current set of tasks: every task does a {{notify()}} on the monitor of the scheduler thread. The thread starts from the head of the queue and checks each task to see whether it can acquire the right lock. If yes, the task is executed; if not, it is left in the work queue.
> When a new task arrives in the work queue, the scheduler thread wakes up and just tries to schedule that task.
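For illustration, here is a rough sketch of the scheduling loop the quoted design describes -- the class and method names are hypothetical, not those of the actual patch. Tasks that cannot take their lock stay in the work queue, and every completing task notifies the scheduler's monitor so the queue is re-scanned from the head.

{code}
import java.util.Iterator;
import java.util.LinkedList;
import java.util.Queue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class OverseerScheduler implements Runnable {

  // Stand-in for a collection operation plus its node in the lock tree.
  interface Task extends Runnable {
    boolean tryLock(); // non-blocking attempt on the appropriate tree node
    void unlock();
  }

  private final Queue<Task> workQueue = new LinkedList<>();
  private final ExecutorService executor = Executors.newCachedThreadPool();
  private final Object monitor = new Object();

  void submit(Task task) {
    synchronized (monitor) {
      workQueue.add(task);
      monitor.notify(); // a new task arrived; wake the scheduler to try it
    }
  }

  @Override
  public void run() {
    while (!Thread.currentThread().isInterrupted()) {
      synchronized (monitor) {
        // Scan from the head of the queue; launch whatever can lock now.
        // (The described design attempts only the newly arrived task on a
        // new arrival; this sketch simply re-scans the whole queue.)
        Iterator<Task> it = workQueue.iterator();
        while (it.hasNext()) {
          Task task = it.next();
          if (task.tryLock()) {
            it.remove();
            executor.execute(() -> {
              try {
                task.run();
              } finally {
                task.unlock();
                synchronized (monitor) {
                  monitor.notify(); // task finished; re-scan the queue
                }
              }
            });
          } // else: leave the task in the work queue
        }
        try {
          monitor.wait(); // sleep until a task completes or a new one arrives
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      }
    }
  }
}
{code}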