[
https://issues.apache.org/jira/browse/SOLR-8744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15298756#comment-15298756
]
David Smiley commented on SOLR-8744:
------------------------------------
bq. (Noble) David, it works. But it is not yet complete. But, I miss the
point. What are we trying to solve? Is the current implementation buggy? We
just need one correct implementation.
I am merely offering an alternative implementation of part of the task that I
feel is simpler and more elegant (as I stated). In part I'm doing this because
it's fun/interesting :-) Other than the complexity of it and of the related
code already in OverseerTaskProcessor, I'm not sure what bugs may or may not
be in your patch or the existing code.
bq. (Scott) Have you run your impl against the test Noble Paul wrote? Curious
if it passes.
No; I'd like to do that. I suggest [~noble.paul] commit to a branch, push, and
I commit on top (assuming tests pass). Of course it will be reverted if you
all don't like it.
bq. (Scott) LayeredLock.tryLock() doesn't really implement markBusy() because
it doesn't retain readLocks on the parent chain if you fail to lock the child.
To implement markBusy you'd need to leave them locked (and have a way to later
completely reset all the read locks).
Thanks for clarifying on IRC why the Overseer would want to use tryLock()
instead of lock() -- something that confused me from your question/statement.
I _had thought_ OverseerTaskProcessor was going to spawn off a task/thread
(more likely Executor impl) that would simply call lock() in its own thread and
it may wait as long as needed to acquire the lock and to do its task. I am
quite ready to admit I don't know what's actually going on here ;-P which is
apparently the case. Let me ask you this then... why *doesn't* it work in this
manner? It sounds simple enough. I do see an executor (tpe field), but
there's so much surrounding code that I'm having trouble grokking it and
thus why tryLock vs lock might be wanted. In such a design (in my simple
conceptual model), fairness=true should be set on the ReentrantReadWriteLocks
in SmileyLockTree to prevent starvation and to get FIFO behavior on the write
lock.
If it's too complicated to explain, maybe let me have at it on a branch. And
again, I may very well not appreciate (yet) why OverseerTaskProcessor with its
proposed LockTree needs to be so complicated. So please don't take any of my
questions or suggestions as a slight.
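
To make the conceptual model above concrete, here is a minimal sketch of the layered idea: read-lock each ancestor, write-lock the target node, with fair ReentrantReadWriteLocks. It is not the attached SmileyLockTree.java; the class name LayeredLockSketch and all method names are hypothetical. The tryLock() variant releases the parent read locks on failure, so it does not implement markBusy(). Note also that the no-argument tryLock() in java.util.concurrent barges and does not honor the fairness setting.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical sketch; not the attached SmileyLockTree.java. */
public class LayeredLockSketch {

    /** One lock per tree node; fair=true hands the lock to the longest-waiting thread. */
    public static ReentrantReadWriteLock newNodeLock() {
        return new ReentrantReadWriteLock(true);
    }

    /**
     * Blocking variant. The path is ordered root first; the last element is
     * the node to hold exclusively. Ancestors are read-locked, the target is
     * write-locked. Returns the held locks, most recently acquired on top.
     */
    public static Deque<Lock> lock(List<ReentrantReadWriteLock> path) {
        Deque<Lock> held = new ArrayDeque<>();
        for (int i = 0; i < path.size(); i++) {
            Lock l = pick(path, i);
            l.lock(); // waits as long as needed
            held.push(l);
        }
        return held;
    }

    /**
     * Non-blocking variant. On failure it releases everything acquired so
     * far and returns null, so the parent read locks are NOT retained.
     */
    public static Deque<Lock> tryLock(List<ReentrantReadWriteLock> path) {
        Deque<Lock> held = new ArrayDeque<>();
        for (int i = 0; i < path.size(); i++) {
            Lock l = pick(path, i);
            if (!l.tryLock()) {
                unlock(held);
                return null;
            }
            held.push(l);
        }
        return held;
    }

    /** Releases in reverse acquisition order; call from the acquiring thread. */
    public static void unlock(Deque<Lock> held) {
        while (!held.isEmpty()) {
            held.pop().unlock();
        }
    }

    // Read lock for ancestors, write lock for the last (target) node.
    private static Lock pick(List<ReentrantReadWriteLock> path, int i) {
        ReentrantReadWriteLock node = path.get(i);
        return i == path.size() - 1 ? node.writeLock() : node.readLock();
    }
}
```

With this shape, two tasks on different shards of the same collection both hold only read locks on the cluster and collection nodes, so they do not block each other, while a collection-level task has to wait for both of them to finish.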
> Overseer operations need more fine grained mutual exclusion
> -----------------------------------------------------------
>
> Key: SOLR-8744
> URL: https://issues.apache.org/jira/browse/SOLR-8744
> Project: Solr
> Issue Type: Improvement
> Components: SolrCloud
> Affects Versions: 5.4.1
> Reporter: Scott Blum
> Assignee: Noble Paul
> Labels: sharding, solrcloud
> Attachments: SOLR-8744.patch, SOLR-8744.patch, SmileyLockTree.java,
> SmileyLockTree.java
>
>
> SplitShard creates a mutex over the whole collection, but, in practice, this
> is a big scaling problem. Multiple split shard operations could happen at
> the same time, as long as different shards are being split. In practice,
> those shards often reside on different machines, so there's no I/O bottleneck
> in those cases, just the mutex in Overseer forcing the operations to be done
> serially.
> Given that a single split can take many minutes on a large collection, this
> is a bottleneck at scale.
> Here is the proposed new design
> There are various Collection operations performed at Overseer. They may need
> exclusive access at various levels. Each operation must define the Access
> level at which the access is required. Access level is an enum.
> CLUSTER(0)
> COLLECTION(1)
> SHARD(2)
> REPLICA(3)
> The Overseer node maintains a tree of these locks. The lock tree would look
> as follows. The tree can be created lazily as and when tasks come up.
> {code}
> Legend:
> C1, C2 -> Collections
> S1, S2 -> Shards
> R1,R2,R3,R4 -> Replicas
>                  Cluster
>                  /     \
>                 /       \
>               C1         C2
>              /  \       /  \
>             /    \     /    \
>           S1      S2  S1     S2
>        R1,R2   R3,R4 R1,R2  R3,R4
> {code}
> When the overseer receives a message, it tries to acquire the appropriate
> lock from the tree. For example, if an operation needs a lock at a Collection
> level and it needs to operate on Collection C1, the node C1 and all child
> nodes of C1 must be free.
> h2. Lock acquiring logic
> Each operation would start from the root of the tree (Level 0 -> Cluster) and
> start moving down depending upon the operation. After it reaches the right
> node, it checks if all the children are free from a lock. If it fails to
> acquire a lock, it remains in the work queue. A scheduler thread waits for
> notification from the current set of tasks. Every task would do a
> {{notify()}} on the monitor of the scheduler thread. The thread would start
> from the head of the queue and check all tasks to see if that task is able to
> acquire the right lock. If yes, it is executed, if not, the task is left in
> the work queue.
> When a new task arrives in the work queue, the scheduler thread wakes and just
> tries to schedule that task.
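
The lock acquiring logic quoted above can be sketched roughly as follows. This is only an illustration of the description, not code from either SOLR-8744.patch: the class LockTreeSketch, its methods, and the representation of a task as a path of names below the cluster root (for example ["C1", "S1"] for a SHARD-level task) are all assumptions made for the example. The wait/notify wake-up of the scheduler thread is left out; schedule() is what that thread would run each time it wakes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the lock tree and one scheduler pass. */
public class LockTreeSketch {

    /** Access levels from the description; the ordinal is the depth in the tree. */
    public enum Level { CLUSTER, COLLECTION, SHARD, REPLICA }

    private static final class Node {
        final Map<String, Node> children = new HashMap<>();
        boolean locked;

        /** True if this node or any descendant is locked. */
        boolean busy() {
            if (locked) return true;
            for (Node child : children.values()) {
                if (child.busy()) return true;
            }
            return false;
        }
    }

    private final Node root = new Node(); // the Cluster node

    /** Level implied by a path: empty path = CLUSTER, one name = COLLECTION, ... */
    public static Level levelOf(List<String> path) {
        return Level.values()[path.size()];
    }

    /** Walks down from the root; succeeds only if no ancestor, the node itself, or any descendant is locked. */
    public synchronized boolean tryLock(List<String> path) {
        Node n = root;
        for (String name : path) {
            if (n.locked) return false; // an ancestor is held
            n = n.children.computeIfAbsent(name, k -> new Node()); // tree is built lazily
        }
        if (n.busy()) return false;
        n.locked = true;
        return true;
    }

    public synchronized void unlock(List<String> path) {
        Node n = root;
        for (String name : path) {
            n = n.children.get(name);
            if (n == null) return; // nothing was ever locked on this path
        }
        n.locked = false;
    }

    /**
     * One scheduler pass: start at the head of the queue and check every task.
     * Tasks that get their lock are removed and returned for execution;
     * the rest stay in the queue in their original order.
     */
    public List<List<String>> schedule(List<List<String>> queue) {
        List<List<String>> runnable = new ArrayList<>();
        for (Iterator<List<String>> it = queue.iterator(); it.hasNext();) {
            List<String> task = it.next();
            if (tryLock(task)) {
                runnable.add(task);
                it.remove();
            }
        }
        return runnable;
    }
}
```

For a queue holding tasks on C1/S1, C1/S2, C1 and C2, the first pass would start the two shard tasks and the C2 task and leave the C1 task queued until both shard locks are released.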