[jira] [Updated] (SOLR-8744) Overseer operations need more fine grained mutual exclusion

2016-06-10 Thread Scott Blum (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-8744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Blum updated SOLR-8744:
-
Attachment: SOLR-8744.patch

> Overseer operations need more fine grained mutual exclusion
> ---
>
> Key: SOLR-8744
> URL: https://issues.apache.org/jira/browse/SOLR-8744
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 5.4.1
>Reporter: Scott Blum
>Assignee: Noble Paul
>Priority: Blocker
>  Labels: sharding, solrcloud
> Fix For: 6.1
>
> Attachments: SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, 
> SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, 
> SOLR-8744.patch, SOLR-8744.patch, SmileyLockTree.java, SmileyLockTree.java
>
>
> SplitShard creates a mutex over the whole collection, but, in practice, this 
> is a big scaling problem.  Multiple split shard operations could happen at 
> the time time, as long as different shards are being split.  In practice, 
> those shards often reside on different machines, so there's no I/O bottleneck 
> in those cases, just the mutex in Overseer forcing the operations to be done 
> serially.
> Given that a single split can take many minutes on a large collection, this 
> is a bottleneck at scale.
> Here is the proposed new design
> There are various Collection operations performed at Overseer. They may need 
> exclusive access at various levels. Each operation must define the Access 
> level at which the access is required. Access level is an enum. 
> CLUSTER(0)
> COLLECTION(1)
> SHARD(2)
> REPLICA(3)
> The Overseer node maintains a tree of these locks. The lock tree would look 
> as follows. The tree can be created lazily as and when tasks come up.
> {code}
> Legend: 
> C1, C2 -> Collections
> S1, S2 -> Shards 
> R1,R2,R3,R4 -> Replicas
>  Cluster
> /   \
>/ \ 
>   C1  C2
>  / \ /   \ 
> /   \   / \  
>S1   S2  S1 S2
> R1, R2  R3.R4  R1,R2   R3,R4
> {code}
> When the overseer receives a message, it tries to acquire the appropriate 
> lock from the tree. For example, if an operation needs a lock at a Collection 
> level and it needs to operate on Collection C1, the node C1 and all child 
> nodes of C1 must be free. 
> h2.Lock acquiring logic
> Each operation would start from the root of the tree (Level 0 -> Cluster) and 
> start moving down depending upon the operation. After it reaches the right 
> node, it checks if all the children are free from a lock.  If it fails to 
> acquire a lock, it remains in the work queue. A scheduler thread waits for 
> notification from the current set of tasks . Every task would do a 
> {{notify()}} on the monitor of  the scheduler thread. The thread would start 
> from the head of the queue and check all tasks to see if that task is able to 
> acquire the right lock. If yes, it is executed, if not, the task is left in 
> the work queue.  
> When a new task arrives in the work queue, the schedulerthread wakes and just 
> try to schedule that task.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-8744) Overseer operations need more fine grained mutual exclusion

2016-06-10 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-8744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-8744:
-
Attachment: SOLR-8744.patch

incorporated the changes. Beasted the test and no failures

> Overseer operations need more fine grained mutual exclusion
> ---
>
> Key: SOLR-8744
> URL: https://issues.apache.org/jira/browse/SOLR-8744
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 5.4.1
>Reporter: Scott Blum
>Assignee: Noble Paul
>Priority: Blocker
>  Labels: sharding, solrcloud
> Fix For: 6.1
>
> Attachments: SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, 
> SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, 
> SOLR-8744.patch, SmileyLockTree.java, SmileyLockTree.java
>
>
> SplitShard creates a mutex over the whole collection, but, in practice, this 
> is a big scaling problem.  Multiple split shard operations could happen at 
> the time time, as long as different shards are being split.  In practice, 
> those shards often reside on different machines, so there's no I/O bottleneck 
> in those cases, just the mutex in Overseer forcing the operations to be done 
> serially.
> Given that a single split can take many minutes on a large collection, this 
> is a bottleneck at scale.
> Here is the proposed new design
> There are various Collection operations performed at Overseer. They may need 
> exclusive access at various levels. Each operation must define the Access 
> level at which the access is required. Access level is an enum. 
> CLUSTER(0)
> COLLECTION(1)
> SHARD(2)
> REPLICA(3)
> The Overseer node maintains a tree of these locks. The lock tree would look 
> as follows. The tree can be created lazily as and when tasks come up.
> {code}
> Legend: 
> C1, C2 -> Collections
> S1, S2 -> Shards 
> R1,R2,R3,R4 -> Replicas
>  Cluster
> /   \
>/ \ 
>   C1  C2
>  / \ /   \ 
> /   \   / \  
>S1   S2  S1 S2
> R1, R2  R3.R4  R1,R2   R3,R4
> {code}
> When the overseer receives a message, it tries to acquire the appropriate 
> lock from the tree. For example, if an operation needs a lock at a Collection 
> level and it needs to operate on Collection C1, the node C1 and all child 
> nodes of C1 must be free. 
> h2.Lock acquiring logic
> Each operation would start from the root of the tree (Level 0 -> Cluster) and 
> start moving down depending upon the operation. After it reaches the right 
> node, it checks if all the children are free from a lock.  If it fails to 
> acquire a lock, it remains in the work queue. A scheduler thread waits for 
> notification from the current set of tasks . Every task would do a 
> {{notify()}} on the monitor of  the scheduler thread. The thread would start 
> from the head of the queue and check all tasks to see if that task is able to 
> acquire the right lock. If yes, it is executed, if not, the task is left in 
> the work queue.  
> When a new task arrives in the work queue, the schedulerthread wakes and just 
> try to schedule that task.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-8744) Overseer operations need more fine grained mutual exclusion

2016-06-09 Thread Scott Blum (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-8744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Blum updated SOLR-8744:
-
Attachment: SOLR-8744.patch

> Overseer operations need more fine grained mutual exclusion
> ---
>
> Key: SOLR-8744
> URL: https://issues.apache.org/jira/browse/SOLR-8744
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 5.4.1
>Reporter: Scott Blum
>Assignee: Noble Paul
>Priority: Blocker
>  Labels: sharding, solrcloud
> Fix For: 6.1
>
> Attachments: SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, 
> SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, 
> SmileyLockTree.java, SmileyLockTree.java
>
>
> SplitShard creates a mutex over the whole collection, but, in practice, this 
> is a big scaling problem.  Multiple split shard operations could happen at 
> the time time, as long as different shards are being split.  In practice, 
> those shards often reside on different machines, so there's no I/O bottleneck 
> in those cases, just the mutex in Overseer forcing the operations to be done 
> serially.
> Given that a single split can take many minutes on a large collection, this 
> is a bottleneck at scale.
> Here is the proposed new design
> There are various Collection operations performed at Overseer. They may need 
> exclusive access at various levels. Each operation must define the Access 
> level at which the access is required. Access level is an enum. 
> CLUSTER(0)
> COLLECTION(1)
> SHARD(2)
> REPLICA(3)
> The Overseer node maintains a tree of these locks. The lock tree would look 
> as follows. The tree can be created lazily as and when tasks come up.
> {code}
> Legend: 
> C1, C2 -> Collections
> S1, S2 -> Shards 
> R1,R2,R3,R4 -> Replicas
>  Cluster
> /   \
>/ \ 
>   C1  C2
>  / \ /   \ 
> /   \   / \  
>S1   S2  S1 S2
> R1, R2  R3.R4  R1,R2   R3,R4
> {code}
> When the overseer receives a message, it tries to acquire the appropriate 
> lock from the tree. For example, if an operation needs a lock at a Collection 
> level and it needs to operate on Collection C1, the node C1 and all child 
> nodes of C1 must be free. 
> h2.Lock acquiring logic
> Each operation would start from the root of the tree (Level 0 -> Cluster) and 
> start moving down depending upon the operation. After it reaches the right 
> node, it checks if all the children are free from a lock.  If it fails to 
> acquire a lock, it remains in the work queue. A scheduler thread waits for 
> notification from the current set of tasks . Every task would do a 
> {{notify()}} on the monitor of  the scheduler thread. The thread would start 
> from the head of the queue and check all tasks to see if that task is able to 
> acquire the right lock. If yes, it is executed, if not, the task is left in 
> the work queue.  
> When a new task arrives in the work queue, the schedulerthread wakes and just 
> try to schedule that task.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-8744) Overseer operations need more fine grained mutual exclusion

2016-06-09 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-8744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-8744:
-
Attachment: (was: SOLR-8744.patch)

> Overseer operations need more fine grained mutual exclusion
> ---
>
> Key: SOLR-8744
> URL: https://issues.apache.org/jira/browse/SOLR-8744
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 5.4.1
>Reporter: Scott Blum
>Assignee: Noble Paul
>Priority: Blocker
>  Labels: sharding, solrcloud
> Fix For: 6.1
>
> Attachments: SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, 
> SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, SmileyLockTree.java, 
> SmileyLockTree.java
>
>
> SplitShard creates a mutex over the whole collection, but, in practice, this 
> is a big scaling problem.  Multiple split shard operations could happen at 
> the time time, as long as different shards are being split.  In practice, 
> those shards often reside on different machines, so there's no I/O bottleneck 
> in those cases, just the mutex in Overseer forcing the operations to be done 
> serially.
> Given that a single split can take many minutes on a large collection, this 
> is a bottleneck at scale.
> Here is the proposed new design
> There are various Collection operations performed at Overseer. They may need 
> exclusive access at various levels. Each operation must define the Access 
> level at which the access is required. Access level is an enum. 
> CLUSTER(0)
> COLLECTION(1)
> SHARD(2)
> REPLICA(3)
> The Overseer node maintains a tree of these locks. The lock tree would look 
> as follows. The tree can be created lazily as and when tasks come up.
> {code}
> Legend: 
> C1, C2 -> Collections
> S1, S2 -> Shards 
> R1,R2,R3,R4 -> Replicas
>  Cluster
> /   \
>/ \ 
>   C1  C2
>  / \ /   \ 
> /   \   / \  
>S1   S2  S1 S2
> R1, R2  R3.R4  R1,R2   R3,R4
> {code}
> When the overseer receives a message, it tries to acquire the appropriate 
> lock from the tree. For example, if an operation needs a lock at a Collection 
> level and it needs to operate on Collection C1, the node C1 and all child 
> nodes of C1 must be free. 
> h2.Lock acquiring logic
> Each operation would start from the root of the tree (Level 0 -> Cluster) and 
> start moving down depending upon the operation. After it reaches the right 
> node, it checks if all the children are free from a lock.  If it fails to 
> acquire a lock, it remains in the work queue. A scheduler thread waits for 
> notification from the current set of tasks . Every task would do a 
> {{notify()}} on the monitor of  the scheduler thread. The thread would start 
> from the head of the queue and check all tasks to see if that task is able to 
> acquire the right lock. If yes, it is executed, if not, the task is left in 
> the work queue.  
> When a new task arrives in the work queue, the schedulerthread wakes and just 
> try to schedule that task.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-8744) Overseer operations need more fine grained mutual exclusion

2016-06-09 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-8744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-8744:
-
Attachment: SOLR-8744.patch

> Overseer operations need more fine grained mutual exclusion
> ---
>
> Key: SOLR-8744
> URL: https://issues.apache.org/jira/browse/SOLR-8744
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 5.4.1
>Reporter: Scott Blum
>Assignee: Noble Paul
>Priority: Blocker
>  Labels: sharding, solrcloud
> Fix For: 6.1
>
> Attachments: SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, 
> SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, SmileyLockTree.java, 
> SmileyLockTree.java
>
>
> SplitShard creates a mutex over the whole collection, but, in practice, this 
> is a big scaling problem.  Multiple split shard operations could happen at 
> the time time, as long as different shards are being split.  In practice, 
> those shards often reside on different machines, so there's no I/O bottleneck 
> in those cases, just the mutex in Overseer forcing the operations to be done 
> serially.
> Given that a single split can take many minutes on a large collection, this 
> is a bottleneck at scale.
> Here is the proposed new design
> There are various Collection operations performed at Overseer. They may need 
> exclusive access at various levels. Each operation must define the Access 
> level at which the access is required. Access level is an enum. 
> CLUSTER(0)
> COLLECTION(1)
> SHARD(2)
> REPLICA(3)
> The Overseer node maintains a tree of these locks. The lock tree would look 
> as follows. The tree can be created lazily as and when tasks come up.
> {code}
> Legend: 
> C1, C2 -> Collections
> S1, S2 -> Shards 
> R1,R2,R3,R4 -> Replicas
>  Cluster
> /   \
>/ \ 
>   C1  C2
>  / \ /   \ 
> /   \   / \  
>S1   S2  S1 S2
> R1, R2  R3.R4  R1,R2   R3,R4
> {code}
> When the overseer receives a message, it tries to acquire the appropriate 
> lock from the tree. For example, if an operation needs a lock at a Collection 
> level and it needs to operate on Collection C1, the node C1 and all child 
> nodes of C1 must be free. 
> h2.Lock acquiring logic
> Each operation would start from the root of the tree (Level 0 -> Cluster) and 
> start moving down depending upon the operation. After it reaches the right 
> node, it checks if all the children are free from a lock.  If it fails to 
> acquire a lock, it remains in the work queue. A scheduler thread waits for 
> notification from the current set of tasks . Every task would do a 
> {{notify()}} on the monitor of  the scheduler thread. The thread would start 
> from the head of the queue and check all tasks to see if that task is able to 
> acquire the right lock. If yes, it is executed, if not, the task is left in 
> the work queue.  
> When a new task arrives in the work queue, the schedulerthread wakes and just 
> try to schedule that task.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-8744) Overseer operations need more fine grained mutual exclusion

2016-06-09 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-8744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-8744:
-
Attachment: SOLR-8744.patch

[~dragonsinth] This takes care of your concerns

> Overseer operations need more fine grained mutual exclusion
> ---
>
> Key: SOLR-8744
> URL: https://issues.apache.org/jira/browse/SOLR-8744
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 5.4.1
>Reporter: Scott Blum
>Assignee: Noble Paul
>Priority: Blocker
>  Labels: sharding, solrcloud
> Fix For: 6.1
>
> Attachments: SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, 
> SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, SmileyLockTree.java, 
> SmileyLockTree.java
>
>
> SplitShard creates a mutex over the whole collection, but, in practice, this 
> is a big scaling problem.  Multiple split shard operations could happen at 
> the time time, as long as different shards are being split.  In practice, 
> those shards often reside on different machines, so there's no I/O bottleneck 
> in those cases, just the mutex in Overseer forcing the operations to be done 
> serially.
> Given that a single split can take many minutes on a large collection, this 
> is a bottleneck at scale.
> Here is the proposed new design
> There are various Collection operations performed at Overseer. They may need 
> exclusive access at various levels. Each operation must define the Access 
> level at which the access is required. Access level is an enum. 
> CLUSTER(0)
> COLLECTION(1)
> SHARD(2)
> REPLICA(3)
> The Overseer node maintains a tree of these locks. The lock tree would look 
> as follows. The tree can be created lazily as and when tasks come up.
> {code}
> Legend: 
> C1, C2 -> Collections
> S1, S2 -> Shards 
> R1,R2,R3,R4 -> Replicas
>  Cluster
> /   \
>/ \ 
>   C1  C2
>  / \ /   \ 
> /   \   / \  
>S1   S2  S1 S2
> R1, R2  R3.R4  R1,R2   R3,R4
> {code}
> When the overseer receives a message, it tries to acquire the appropriate 
> lock from the tree. For example, if an operation needs a lock at a Collection 
> level and it needs to operate on Collection C1, the node C1 and all child 
> nodes of C1 must be free. 
> h2.Lock acquiring logic
> Each operation would start from the root of the tree (Level 0 -> Cluster) and 
> start moving down depending upon the operation. After it reaches the right 
> node, it checks if all the children are free from a lock.  If it fails to 
> acquire a lock, it remains in the work queue. A scheduler thread waits for 
> notification from the current set of tasks . Every task would do a 
> {{notify()}} on the monitor of  the scheduler thread. The thread would start 
> from the head of the queue and check all tasks to see if that task is able to 
> acquire the right lock. If yes, it is executed, if not, the task is left in 
> the work queue.  
> When a new task arrives in the work queue, the schedulerthread wakes and just 
> try to schedule that task.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-8744) Overseer operations need more fine grained mutual exclusion

2016-06-08 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-8744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-8744:
-
Priority: Blocker  (was: Major)

> Overseer operations need more fine grained mutual exclusion
> ---
>
> Key: SOLR-8744
> URL: https://issues.apache.org/jira/browse/SOLR-8744
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 5.4.1
>Reporter: Scott Blum
>Assignee: Noble Paul
>Priority: Blocker
>  Labels: sharding, solrcloud
> Fix For: 6.1
>
> Attachments: SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, 
> SOLR-8744.patch, SOLR-8744.patch, SmileyLockTree.java, SmileyLockTree.java
>
>
> SplitShard creates a mutex over the whole collection, but, in practice, this 
> is a big scaling problem.  Multiple split shard operations could happen at 
> the time time, as long as different shards are being split.  In practice, 
> those shards often reside on different machines, so there's no I/O bottleneck 
> in those cases, just the mutex in Overseer forcing the operations to be done 
> serially.
> Given that a single split can take many minutes on a large collection, this 
> is a bottleneck at scale.
> Here is the proposed new design
> There are various Collection operations performed at Overseer. They may need 
> exclusive access at various levels. Each operation must define the Access 
> level at which the access is required. Access level is an enum. 
> CLUSTER(0)
> COLLECTION(1)
> SHARD(2)
> REPLICA(3)
> The Overseer node maintains a tree of these locks. The lock tree would look 
> as follows. The tree can be created lazily as and when tasks come up.
> {code}
> Legend: 
> C1, C2 -> Collections
> S1, S2 -> Shards 
> R1,R2,R3,R4 -> Replicas
>  Cluster
> /   \
>/ \ 
>   C1  C2
>  / \ /   \ 
> /   \   / \  
>S1   S2  S1 S2
> R1, R2  R3.R4  R1,R2   R3,R4
> {code}
> When the overseer receives a message, it tries to acquire the appropriate 
> lock from the tree. For example, if an operation needs a lock at a Collection 
> level and it needs to operate on Collection C1, the node C1 and all child 
> nodes of C1 must be free. 
> h2.Lock acquiring logic
> Each operation would start from the root of the tree (Level 0 -> Cluster) and 
> start moving down depending upon the operation. After it reaches the right 
> node, it checks if all the children are free from a lock.  If it fails to 
> acquire a lock, it remains in the work queue. A scheduler thread waits for 
> notification from the current set of tasks . Every task would do a 
> {{notify()}} on the monitor of  the scheduler thread. The thread would start 
> from the head of the queue and check all tasks to see if that task is able to 
> acquire the right lock. If yes, it is executed, if not, the task is left in 
> the work queue.  
> When a new task arrives in the work queue, the schedulerthread wakes and just 
> try to schedule that task.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-8744) Overseer operations need more fine grained mutual exclusion

2016-06-08 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-8744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-8744:
-
Fix Version/s: 6.1

> Overseer operations need more fine grained mutual exclusion
> ---
>
> Key: SOLR-8744
> URL: https://issues.apache.org/jira/browse/SOLR-8744
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 5.4.1
>Reporter: Scott Blum
>Assignee: Noble Paul
>Priority: Blocker
>  Labels: sharding, solrcloud
> Fix For: 6.1
>
> Attachments: SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, 
> SOLR-8744.patch, SOLR-8744.patch, SmileyLockTree.java, SmileyLockTree.java
>
>
> SplitShard creates a mutex over the whole collection, but, in practice, this 
> is a big scaling problem.  Multiple split shard operations could happen at 
> the time time, as long as different shards are being split.  In practice, 
> those shards often reside on different machines, so there's no I/O bottleneck 
> in those cases, just the mutex in Overseer forcing the operations to be done 
> serially.
> Given that a single split can take many minutes on a large collection, this 
> is a bottleneck at scale.
> Here is the proposed new design
> There are various Collection operations performed at Overseer. They may need 
> exclusive access at various levels. Each operation must define the Access 
> level at which the access is required. Access level is an enum. 
> CLUSTER(0)
> COLLECTION(1)
> SHARD(2)
> REPLICA(3)
> The Overseer node maintains a tree of these locks. The lock tree would look 
> as follows. The tree can be created lazily as and when tasks come up.
> {code}
> Legend: 
> C1, C2 -> Collections
> S1, S2 -> Shards 
> R1,R2,R3,R4 -> Replicas
>  Cluster
> /   \
>/ \ 
>   C1  C2
>  / \ /   \ 
> /   \   / \  
>S1   S2  S1 S2
> R1, R2  R3.R4  R1,R2   R3,R4
> {code}
> When the overseer receives a message, it tries to acquire the appropriate 
> lock from the tree. For example, if an operation needs a lock at a Collection 
> level and it needs to operate on Collection C1, the node C1 and all child 
> nodes of C1 must be free. 
> h2.Lock acquiring logic
> Each operation would start from the root of the tree (Level 0 -> Cluster) and 
> start moving down depending upon the operation. After it reaches the right 
> node, it checks if all the children are free from a lock.  If it fails to 
> acquire a lock, it remains in the work queue. A scheduler thread waits for 
> notification from the current set of tasks . Every task would do a 
> {{notify()}} on the monitor of  the scheduler thread. The thread would start 
> from the head of the queue and check all tasks to see if that task is able to 
> acquire the right lock. If yes, it is executed, if not, the task is left in 
> the work queue.  
> When a new task arrives in the work queue, the schedulerthread wakes and just 
> try to schedule that task.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-8744) Overseer operations need more fine grained mutual exclusion

2016-06-07 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-8744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-8744:
-
Attachment: SOLR-8744.patch

This keeps a local copy of clocked tasks and uses that in the excluded tasks 
check

> Overseer operations need more fine grained mutual exclusion
> ---
>
> Key: SOLR-8744
> URL: https://issues.apache.org/jira/browse/SOLR-8744
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 5.4.1
>Reporter: Scott Blum
>Assignee: Noble Paul
>  Labels: sharding, solrcloud
> Attachments: SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, 
> SOLR-8744.patch, SOLR-8744.patch, SmileyLockTree.java, SmileyLockTree.java
>
>
> SplitShard creates a mutex over the whole collection, but, in practice, this 
> is a big scaling problem.  Multiple split shard operations could happen at 
> the time time, as long as different shards are being split.  In practice, 
> those shards often reside on different machines, so there's no I/O bottleneck 
> in those cases, just the mutex in Overseer forcing the operations to be done 
> serially.
> Given that a single split can take many minutes on a large collection, this 
> is a bottleneck at scale.
> Here is the proposed new design
> There are various Collection operations performed at Overseer. They may need 
> exclusive access at various levels. Each operation must define the Access 
> level at which the access is required. Access level is an enum. 
> CLUSTER(0)
> COLLECTION(1)
> SHARD(2)
> REPLICA(3)
> The Overseer node maintains a tree of these locks. The lock tree would look 
> as follows. The tree can be created lazily as and when tasks come up.
> {code}
> Legend: 
> C1, C2 -> Collections
> S1, S2 -> Shards 
> R1,R2,R3,R4 -> Replicas
>  Cluster
> /   \
>/ \ 
>   C1  C2
>  / \ /   \ 
> /   \   / \  
>S1   S2  S1 S2
> R1, R2  R3.R4  R1,R2   R3,R4
> {code}
> When the overseer receives a message, it tries to acquire the appropriate 
> lock from the tree. For example, if an operation needs a lock at a Collection 
> level and it needs to operate on Collection C1, the node C1 and all child 
> nodes of C1 must be free. 
> h2.Lock acquiring logic
> Each operation would start from the root of the tree (Level 0 -> Cluster) and 
> start moving down depending upon the operation. After it reaches the right 
> node, it checks if all the children are free from a lock.  If it fails to 
> acquire a lock, it remains in the work queue. A scheduler thread waits for 
> notification from the current set of tasks . Every task would do a 
> {{notify()}} on the monitor of  the scheduler thread. The thread would start 
> from the head of the queue and check all tasks to see if that task is able to 
> acquire the right lock. If yes, it is executed, if not, the task is left in 
> the work queue.  
> When a new task arrives in the work queue, the schedulerthread wakes and just 
> try to schedule that task.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-8744) Overseer operations need more fine grained mutual exclusion

2016-05-25 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-8744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-8744:
-
Attachment: SOLR-8744.patch

I missed those changes while merging between branches. Added back in this 
patch. I'll do another review to see what else is lost

> Overseer operations need more fine grained mutual exclusion
> ---
>
> Key: SOLR-8744
> URL: https://issues.apache.org/jira/browse/SOLR-8744
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 5.4.1
>Reporter: Scott Blum
>Assignee: Noble Paul
>  Labels: sharding, solrcloud
> Attachments: SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, 
> SOLR-8744.patch, SmileyLockTree.java, SmileyLockTree.java
>
>
> SplitShard creates a mutex over the whole collection, but, in practice, this 
> is a big scaling problem.  Multiple split shard operations could happen at 
> the time time, as long as different shards are being split.  In practice, 
> those shards often reside on different machines, so there's no I/O bottleneck 
> in those cases, just the mutex in Overseer forcing the operations to be done 
> serially.
> Given that a single split can take many minutes on a large collection, this 
> is a bottleneck at scale.
> Here is the proposed new design
> There are various Collection operations performed at Overseer. They may need 
> exclusive access at various levels. Each operation must define the Access 
> level at which the access is required. Access level is an enum. 
> CLUSTER(0)
> COLLECTION(1)
> SHARD(2)
> REPLICA(3)
> The Overseer node maintains a tree of these locks. The lock tree would look 
> as follows. The tree can be created lazily as and when tasks come up.
> {code}
> Legend: 
> C1, C2 -> Collections
> S1, S2 -> Shards 
> R1,R2,R3,R4 -> Replicas
>  Cluster
> /   \
>/ \ 
>   C1  C2
>  / \ /   \ 
> /   \   / \  
>S1   S2  S1 S2
> R1, R2  R3.R4  R1,R2   R3,R4
> {code}
> When the overseer receives a message, it tries to acquire the appropriate 
> lock from the tree. For example, if an operation needs a lock at a Collection 
> level and it needs to operate on Collection C1, the node C1 and all child 
> nodes of C1 must be free. 
> h2.Lock acquiring logic
> Each operation would start from the root of the tree (Level 0 -> Cluster) and 
> start moving down depending upon the operation. After it reaches the right 
> node, it checks if all the children are free from a lock.  If it fails to 
> acquire a lock, it remains in the work queue. A scheduler thread waits for 
> notification from the current set of tasks . Every task would do a 
> {{notify()}} on the monitor of  the scheduler thread. The thread would start 
> from the head of the queue and check all tasks to see if that task is able to 
> acquire the right lock. If yes, it is executed, if not, the task is left in 
> the work queue.  
> When a new task arrives in the work queue, the schedulerthread wakes and just 
> try to schedule that task.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-8744) Overseer operations need more fine grained mutual exclusion

2016-05-24 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-8744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-8744:
-
Attachment: (was: SOLR-8744.patch)

> Overseer operations need more fine grained mutual exclusion
> ---
>
> Key: SOLR-8744
> URL: https://issues.apache.org/jira/browse/SOLR-8744
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 5.4.1
>Reporter: Scott Blum
>Assignee: Noble Paul
>  Labels: sharding, solrcloud
> Attachments: SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, 
> SmileyLockTree.java, SmileyLockTree.java
>
>
> SplitShard creates a mutex over the whole collection, but, in practice, this 
> is a big scaling problem.  Multiple split shard operations could happen at 
> the time time, as long as different shards are being split.  In practice, 
> those shards often reside on different machines, so there's no I/O bottleneck 
> in those cases, just the mutex in Overseer forcing the operations to be done 
> serially.
> Given that a single split can take many minutes on a large collection, this 
> is a bottleneck at scale.
> Here is the proposed new design
> There are various Collection operations performed at Overseer. They may need 
> exclusive access at various levels. Each operation must define the Access 
> level at which the access is required. Access level is an enum. 
> CLUSTER(0)
> COLLECTION(1)
> SHARD(2)
> REPLICA(3)
> The Overseer node maintains a tree of these locks. The lock tree would look 
> as follows. The tree can be created lazily as and when tasks come up.
> {code}
> Legend: 
> C1, C2 -> Collections
> S1, S2 -> Shards 
> R1,R2,R3,R4 -> Replicas
>  Cluster
> /   \
>/ \ 
>   C1  C2
>  / \ /   \ 
> /   \   / \  
>S1   S2  S1 S2
> R1, R2  R3.R4  R1,R2   R3,R4
> {code}
> When the overseer receives a message, it tries to acquire the appropriate 
> lock from the tree. For example, if an operation needs a lock at a Collection 
> level and it needs to operate on Collection C1, the node C1 and all child 
> nodes of C1 must be free. 
> h2.Lock acquiring logic
> Each operation would start from the root of the tree (Level 0 -> Cluster) and 
> start moving down depending upon the operation. After it reaches the right 
> node, it checks if all the children are free from a lock.  If it fails to 
> acquire a lock, it remains in the work queue. A scheduler thread waits for 
> notification from the current set of tasks . Every task would do a 
> {{notify()}} on the monitor of  the scheduler thread. The thread would start 
> from the head of the queue and check all tasks to see if that task is able to 
> acquire the right lock. If yes, it is executed, if not, the task is left in 
> the work queue.  
> When a new task arrives in the work queue, the schedulerthread wakes and just 
> try to schedule that task.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-8744) Overseer operations need more fine grained mutual exclusion

2016-05-24 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-8744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-8744:
-
Attachment: SOLR-8744.patch

> Overseer operations need more fine grained mutual exclusion
> ---
>
> Key: SOLR-8744
> URL: https://issues.apache.org/jira/browse/SOLR-8744
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 5.4.1
>Reporter: Scott Blum
>Assignee: Noble Paul
>  Labels: sharding, solrcloud
> Attachments: SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, 
> SmileyLockTree.java, SmileyLockTree.java
>
>
> SplitShard creates a mutex over the whole collection, but, in practice, this 
> is a big scaling problem.  Multiple split shard operations could happen at 
> the time time, as long as different shards are being split.  In practice, 
> those shards often reside on different machines, so there's no I/O bottleneck 
> in those cases, just the mutex in Overseer forcing the operations to be done 
> serially.
> Given that a single split can take many minutes on a large collection, this 
> is a bottleneck at scale.
> Here is the proposed new design
> There are various Collection operations performed at Overseer. They may need 
> exclusive access at various levels. Each operation must define the Access 
> level at which the access is required. Access level is an enum. 
> CLUSTER(0)
> COLLECTION(1)
> SHARD(2)
> REPLICA(3)
> The Overseer node maintains a tree of these locks. The lock tree would look 
> as follows. The tree can be created lazily as and when tasks come up.
> {code}
> Legend: 
> C1, C2 -> Collections
> S1, S2 -> Shards 
> R1,R2,R3,R4 -> Replicas
>  Cluster
> /   \
>/ \ 
>   C1  C2
>  / \ /   \ 
> /   \   / \  
>S1   S2  S1 S2
> R1, R2  R3.R4  R1,R2   R3,R4
> {code}
> When the overseer receives a message, it tries to acquire the appropriate 
> lock from the tree. For example, if an operation needs a lock at a Collection 
> level and it needs to operate on Collection C1, the node C1 and all child 
> nodes of C1 must be free. 
> h2.Lock acquiring logic
> Each operation would start from the root of the tree (Level 0 -> Cluster) and 
> start moving down depending upon the operation. After it reaches the right 
> node, it checks if all the children are free from a lock.  If it fails to 
> acquire a lock, it remains in the work queue. A scheduler thread waits for 
> notification from the current set of tasks . Every task would do a 
> {{notify()}} on the monitor of  the scheduler thread. The thread would start 
> from the head of the queue and check all tasks to see if that task is able to 
> acquire the right lock. If yes, it is executed, if not, the task is left in 
> the work queue.  
> When a new task arrives in the work queue, the schedulerthread wakes and just 
> try to schedule that task.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-8744) Overseer operations need more fine grained mutual exclusion

2016-05-24 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-8744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-8744:
-
Attachment: SOLR-8744.patch

Implements your suggestions [~dragonsinth]

> Overseer operations need more fine grained mutual exclusion
> ---
>
> Key: SOLR-8744
> URL: https://issues.apache.org/jira/browse/SOLR-8744
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 5.4.1
>Reporter: Scott Blum
>Assignee: Noble Paul
>  Labels: sharding, solrcloud
> Attachments: SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, 
> SmileyLockTree.java, SmileyLockTree.java
>
>
> SplitShard creates a mutex over the whole collection, but, in practice, this 
> is a big scaling problem.  Multiple split shard operations could happen at 
> the time time, as long as different shards are being split.  In practice, 
> those shards often reside on different machines, so there's no I/O bottleneck 
> in those cases, just the mutex in Overseer forcing the operations to be done 
> serially.
> Given that a single split can take many minutes on a large collection, this 
> is a bottleneck at scale.
> Here is the proposed new design
> There are various Collection operations performed at Overseer. They may need 
> exclusive access at various levels. Each operation must define the Access 
> level at which the access is required. Access level is an enum. 
> CLUSTER(0)
> COLLECTION(1)
> SHARD(2)
> REPLICA(3)
> The Overseer node maintains a tree of these locks. The lock tree would look 
> as follows. The tree can be created lazily as and when tasks come up.
> {code}
> Legend: 
> C1, C2 -> Collections
> S1, S2 -> Shards 
> R1,R2,R3,R4 -> Replicas
>  Cluster
> /   \
>/ \ 
>   C1  C2
>  / \ /   \ 
> /   \   / \  
>S1   S2  S1 S2
> R1, R2  R3.R4  R1,R2   R3,R4
> {code}
> When the overseer receives a message, it tries to acquire the appropriate 
> lock from the tree. For example, if an operation needs a lock at a Collection 
> level and it needs to operate on Collection C1, the node C1 and all child 
> nodes of C1 must be free. 
> h2.Lock acquiring logic
> Each operation would start from the root of the tree (Level 0 -> Cluster) and 
> start moving down depending upon the operation. After it reaches the right 
> node, it checks if all the children are free from a lock.  If it fails to 
> acquire a lock, it remains in the work queue. A scheduler thread waits for 
> notification from the current set of tasks . Every task would do a 
> {{notify()}} on the monitor of  the scheduler thread. The thread would start 
> from the head of the queue and check all tasks to see if that task is able to 
> acquire the right lock. If yes, it is executed, if not, the task is left in 
> the work queue.  
> When a new task arrives in the work queue, the schedulerthread wakes and just 
> try to schedule that task.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-8744) Overseer operations need more fine grained mutual exclusion

2016-05-24 Thread David Smiley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-8744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley updated SOLR-8744:
---
Attachment: SmileyLockTree.java

Thanks for the review [~noble.paul].  I attached a new version that uses 
List instead of '/' separated strings.  It's even fewer lines of code 
:-)

bq. It's not enough to ensure that your parent is not locked. It is also 
required that none of your children are locked too. For example, C1 cannot be 
locked if C1/S1/R1 or C1/S2/R1 is locked

Did you see the javadocs?  I could have added more, and I added a little more.  
Or maybe you are unfamiliar with how ReentrantReadWriteLock works?  See, there 
are two locks at each node in the tree -- a read lock and a write lock, 
courtesy of ReentrantReadWriteLock which does most of the work here.  Multiple 
read locks can be open for a ReentrantReadWriteLock but the write lock is 
mutually exclusive to them all.  The Lock that SmileyLockTree returns at a 
given "node" will first obtain a read lock at all ancestors (top down) and then 
obtain a write lock on the node for itself.  If a Lock for {{C1/S1/R1}} is 
obtained, it will be a LayeredLock with "thisLock" being the write lock for 
itself (R1) and the "parentLock" is a chain of LeveledLock _of read locks_ to 
its ancestors.  The origin of the chain (i.e. if you keep following 
LevelledLock parentLock instances) is the read lock on the root, which is a 
direct instance of ReentrantReadWriteLock.readLock whereas all the rest are the 
anonymous inner class ReadWriteLock that returns a LayeredLock for it's read & 
write sides.  So when the caller calls lock(), the first concrete Lock impl to 
be locked with be the read Lock on root (empty string), then the read lock on 
{{C1}} then the read lock on {{C1/S1}} then the _write lock_ on {{C1/S1/R1}}.  
For the duration of time this is locked, any siblings of R1 or R1's ancestors 
may be locked, or they could have been locked before.  What would block would 
be a an ancestor -- {{C1}} in your example would block because the write lock 
blocks for read locks, and the read block is locked by not only {{C1/S1/R1}} 
but also {{C1/S2/R1}} in your example.

It's a little functional which maybe you aren't used to, and it leverages 
useful abstractions and implementations from the JDK that maybe you are 
unfamiliar with, so perhaps that's why it's not clear how this works.  Do you 
grok this [~dragonsinth]?

> Overseer operations need more fine grained mutual exclusion
> ---
>
> Key: SOLR-8744
> URL: https://issues.apache.org/jira/browse/SOLR-8744
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 5.4.1
>Reporter: Scott Blum
>Assignee: Noble Paul
>  Labels: sharding, solrcloud
> Attachments: SOLR-8744.patch, SOLR-8744.patch, SmileyLockTree.java, 
> SmileyLockTree.java
>
>
> SplitShard creates a mutex over the whole collection, but, in practice, this 
> is a big scaling problem.  Multiple split shard operations could happen at 
> the time time, as long as different shards are being split.  In practice, 
> those shards often reside on different machines, so there's no I/O bottleneck 
> in those cases, just the mutex in Overseer forcing the operations to be done 
> serially.
> Given that a single split can take many minutes on a large collection, this 
> is a bottleneck at scale.
> Here is the proposed new design
> There are various Collection operations performed at Overseer. They may need 
> exclusive access at various levels. Each operation must define the Access 
> level at which the access is required. Access level is an enum. 
> CLUSTER(0)
> COLLECTION(1)
> SHARD(2)
> REPLICA(3)
> The Overseer node maintains a tree of these locks. The lock tree would look 
> as follows. The tree can be created lazily as and when tasks come up.
> {code}
> Legend: 
> C1, C2 -> Collections
> S1, S2 -> Shards 
> R1,R2,R3,R4 -> Replicas
>  Cluster
> /   \
>/ \ 
>   C1  C2
>  / \ /   \ 
> /   \   / \  
>S1   S2  S1 S2
> R1, R2  R3.R4  R1,R2   R3,R4
> {code}
> When the overseer receives a message, it tries to acquire the appropriate 
> lock from the tree. For example, if an operation needs a lock at a Collection 
> level and it needs to operate on Collection C1, the node C1 and all child 
> nodes of C1 must be free. 
> h2.Lock acquiring logic
> Each operation would start from the root of the tree (Level 0 -> Cluster) and 
> start moving down depending upon the operation. After it reaches the right 
> node, it checks if all the children are free from a lock.  If it fails to 
> acquire a lock, it remains in the work qu

[jira] [Updated] (SOLR-8744) Overseer operations need more fine grained mutual exclusion

2016-05-24 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-8744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-8744:
-
Attachment: SOLR-8744.patch

The locktree is reset every time the {{runningTasks}} is empty. So there is no 
chance of a blow up. 

> Overseer operations need more fine grained mutual exclusion
> ---
>
> Key: SOLR-8744
> URL: https://issues.apache.org/jira/browse/SOLR-8744
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 5.4.1
>Reporter: Scott Blum
>Assignee: Noble Paul
>  Labels: sharding, solrcloud
> Attachments: SOLR-8744.patch, SOLR-8744.patch, SmileyLockTree.java
>
>
> SplitShard creates a mutex over the whole collection, but, in practice, this 
> is a big scaling problem.  Multiple split shard operations could happen at 
> the time time, as long as different shards are being split.  In practice, 
> those shards often reside on different machines, so there's no I/O bottleneck 
> in those cases, just the mutex in Overseer forcing the operations to be done 
> serially.
> Given that a single split can take many minutes on a large collection, this 
> is a bottleneck at scale.
> Here is the proposed new design
> There are various Collection operations performed at Overseer. They may need 
> exclusive access at various levels. Each operation must define the Access 
> level at which the access is required. Access level is an enum. 
> CLUSTER(0)
> COLLECTION(1)
> SHARD(2)
> REPLICA(3)
> The Overseer node maintains a tree of these locks. The lock tree would look 
> as follows. The tree can be created lazily as and when tasks come up.
> {code}
> Legend: 
> C1, C2 -> Collections
> S1, S2 -> Shards 
> R1,R2,R3,R4 -> Replicas
>  Cluster
> /   \
>/ \ 
>   C1  C2
>  / \ /   \ 
> /   \   / \  
>S1   S2  S1 S2
> R1, R2  R3.R4  R1,R2   R3,R4
> {code}
> When the overseer receives a message, it tries to acquire the appropriate 
> lock from the tree. For example, if an operation needs a lock at a Collection 
> level and it needs to operate on Collection C1, the node C1 and all child 
> nodes of C1 must be free. 
> h2.Lock acquiring logic
> Each operation would start from the root of the tree (Level 0 -> Cluster) and 
> start moving down depending upon the operation. After it reaches the right 
> node, it checks if all the children are free from a lock.  If it fails to 
> acquire a lock, it remains in the work queue. A scheduler thread waits for 
> notification from the current set of tasks . Every task would do a 
> {{notify()}} on the monitor of  the scheduler thread. The thread would start 
> from the head of the queue and check all tasks to see if that task is able to 
> acquire the right lock. If yes, it is executed, if not, the task is left in 
> the work queue.  
> When a new task arrives in the work queue, the schedulerthread wakes and just 
> try to schedule that task.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-8744) Overseer operations need more fine grained mutual exclusion

2016-05-23 Thread David Smiley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-8744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley updated SOLR-8744:
---
Attachment: SmileyLockTree.java

Here's SmileyLockTree.java (name is obviously just to not confuse it with your 
LockTree).  I ditched the weak values idea as I had doubts on the added 
complexity.  I changed the impl a bit from what I had said to make it feel more 
elegant to me, and I implemented most of JDK Lock for fun.  After a short 
discussion with [~dragonsinth] on IRC on the context of this task, I believe 
there's no issue of starvation (for SmileyLockTree specifically any way) since 
the Overseer runs sequentially so I didn't add the fairness=true option.

> Overseer operations need more fine grained mutual exclusion
> ---
>
> Key: SOLR-8744
> URL: https://issues.apache.org/jira/browse/SOLR-8744
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 5.4.1
>Reporter: Scott Blum
>Assignee: Noble Paul
>  Labels: sharding, solrcloud
> Attachments: SOLR-8744.patch, SmileyLockTree.java
>
>
> SplitShard creates a mutex over the whole collection, but, in practice, this 
> is a big scaling problem.  Multiple split shard operations could happen at 
> the time time, as long as different shards are being split.  In practice, 
> those shards often reside on different machines, so there's no I/O bottleneck 
> in those cases, just the mutex in Overseer forcing the operations to be done 
> serially.
> Given that a single split can take many minutes on a large collection, this 
> is a bottleneck at scale.
> Here is the proposed new design
> There are various Collection operations performed at Overseer. They may need 
> exclusive access at various levels. Each operation must define the Access 
> level at which the access is required. Access level is an enum. 
> CLUSTER(0)
> COLLECTION(1)
> SHARD(2)
> REPLICA(3)
> The Overseer node maintains a tree of these locks. The lock tree would look 
> as follows. The tree can be created lazily as and when tasks come up.
> {code}
> Legend: 
> C1, C2 -> Collections
> S1, S2 -> Shards 
> R1,R2,R3,R4 -> Replicas
>  Cluster
> /   \
>/ \ 
>   C1  C2
>  / \ /   \ 
> /   \   / \  
>S1   S2  S1 S2
> R1, R2  R3.R4  R1,R2   R3,R4
> {code}
> When the overseer receives a message, it tries to acquire the appropriate 
> lock from the tree. For example, if an operation needs a lock at a Collection 
> level and it needs to operate on Collection C1, the node C1 and all child 
> nodes of C1 must be free. 
> h2.Lock acquiring logic
> Each operation would start from the root of the tree (Level 0 -> Cluster) and 
> start moving down depending upon the operation. After it reaches the right 
> node, it checks if all the children are free from a lock.  If it fails to 
> acquire a lock, it remains in the work queue. A scheduler thread waits for 
> notification from the current set of tasks . Every task would do a 
> {{notify()}} on the monitor of  the scheduler thread. The thread would start 
> from the head of the queue and check all tasks to see if that task is able to 
> acquire the right lock. If yes, it is executed, if not, the task is left in 
> the work queue.  
> When a new task arrives in the work queue, the schedulerthread wakes and just 
> try to schedule that task.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-8744) Overseer operations need more fine grained mutual exclusion

2016-05-22 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-8744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-8744:
-
Attachment: (was: SOLR-8744.patch)

> Overseer operations need more fine grained mutual exclusion
> ---
>
> Key: SOLR-8744
> URL: https://issues.apache.org/jira/browse/SOLR-8744
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 5.4.1
>Reporter: Scott Blum
>Assignee: Noble Paul
>  Labels: sharding, solrcloud
> Attachments: SOLR-8744.patch
>
>
> SplitShard creates a mutex over the whole collection, but, in practice, this 
> is a big scaling problem.  Multiple split shard operations could happen at 
> the time time, as long as different shards are being split.  In practice, 
> those shards often reside on different machines, so there's no I/O bottleneck 
> in those cases, just the mutex in Overseer forcing the operations to be done 
> serially.
> Given that a single split can take many minutes on a large collection, this 
> is a bottleneck at scale.
> Here is the proposed new design
> There are various Collection operations performed at Overseer. They may need 
> exclusive access at various levels. Each operation must define the Access 
> level at which the access is required. Access level is an enum. 
> CLUSTER(0)
> COLLECTION(1)
> SHARD(2)
> REPLICA(3)
> The Overseer node maintains a tree of these locks. The lock tree would look 
> as follows. The tree can be created lazily as and when tasks come up.
> {code}
> Legend: 
> C1, C2 -> Collections
> S1, S2 -> Shards 
> R1,R2,R3,R4 -> Replicas
>  Cluster
> /   \
>/ \ 
>   C1  C2
>  / \ /   \ 
> /   \   / \  
>S1   S2  S1 S2
> R1, R2  R3.R4  R1,R2   R3,R4
> {code}
> When the overseer receives a message, it tries to acquire the appropriate 
> lock from the tree. For example, if an operation needs a lock at a Collection 
> level and it needs to operate on Collection C1, the node C1 and all child 
> nodes of C1 must be free. 
> h2.Lock acquiring logic
> Each operation would start from the root of the tree (Level 0 -> Cluster) and 
> start moving down depending upon the operation. After it reaches the right 
> node, it checks if all the children are free from a lock.  If it fails to 
> acquire a lock, it remains in the work queue. A scheduler thread waits for 
> notification from the current set of tasks . Every task would do a 
> {{notify()}} on the monitor of  the scheduler thread. The thread would start 
> from the head of the queue and check all tasks to see if that task is able to 
> acquire the right lock. If yes, it is executed, if not, the task is left in 
> the work queue.  
> When a new task arrives in the work queue, the schedulerthread wakes and just 
> try to schedule that task.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-8744) Overseer operations need more fine grained mutual exclusion

2016-05-22 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-8744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-8744:
-
Attachment: SOLR-8744.patch

> Overseer operations need more fine grained mutual exclusion
> ---
>
> Key: SOLR-8744
> URL: https://issues.apache.org/jira/browse/SOLR-8744
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 5.4.1
>Reporter: Scott Blum
>Assignee: Noble Paul
>  Labels: sharding, solrcloud
> Attachments: SOLR-8744.patch
>
>
> SplitShard creates a mutex over the whole collection, but, in practice, this 
> is a big scaling problem.  Multiple split shard operations could happen at 
> the time time, as long as different shards are being split.  In practice, 
> those shards often reside on different machines, so there's no I/O bottleneck 
> in those cases, just the mutex in Overseer forcing the operations to be done 
> serially.
> Given that a single split can take many minutes on a large collection, this 
> is a bottleneck at scale.
> Here is the proposed new design
> There are various Collection operations performed at Overseer. They may need 
> exclusive access at various levels. Each operation must define the Access 
> level at which the access is required. Access level is an enum. 
> CLUSTER(0)
> COLLECTION(1)
> SHARD(2)
> REPLICA(3)
> The Overseer node maintains a tree of these locks. The lock tree would look 
> as follows. The tree can be created lazily as and when tasks come up.
> {code}
> Legend: 
> C1, C2 -> Collections
> S1, S2 -> Shards 
> R1,R2,R3,R4 -> Replicas
>  Cluster
> /   \
>/ \ 
>   C1  C2
>  / \ /   \ 
> /   \   / \  
>S1   S2  S1 S2
> R1, R2  R3.R4  R1,R2   R3,R4
> {code}
> When the overseer receives a message, it tries to acquire the appropriate 
> lock from the tree. For example, if an operation needs a lock at a Collection 
> level and it needs to operate on Collection C1, the node C1 and all child 
> nodes of C1 must be free. 
> h2.Lock acquiring logic
> Each operation would start from the root of the tree (Level 0 -> Cluster) and 
> start moving down depending upon the operation. After it reaches the right 
> node, it checks if all the children are free from a lock.  If it fails to 
> acquire a lock, it remains in the work queue. A scheduler thread waits for 
> notification from the current set of tasks . Every task would do a 
> {{notify()}} on the monitor of  the scheduler thread. The thread would start 
> from the head of the queue and check all tasks to see if that task is able to 
> acquire the right lock. If yes, it is executed, if not, the task is left in 
> the work queue.  
> When a new task arrives in the work queue, the schedulerthread wakes and just 
> try to schedule that task.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-8744) Overseer operations need more fine grained mutual exclusion

2016-05-22 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-8744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-8744:
-
Attachment: (was: SOLR-8744.patch)

> Overseer operations need more fine grained mutual exclusion
> ---
>
> Key: SOLR-8744
> URL: https://issues.apache.org/jira/browse/SOLR-8744
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 5.4.1
>Reporter: Scott Blum
>Assignee: Noble Paul
>  Labels: sharding, solrcloud
> Attachments: SOLR-8744.patch
>
>
> SplitShard creates a mutex over the whole collection, but, in practice, this 
> is a big scaling problem.  Multiple split shard operations could happen at 
> the time time, as long as different shards are being split.  In practice, 
> those shards often reside on different machines, so there's no I/O bottleneck 
> in those cases, just the mutex in Overseer forcing the operations to be done 
> serially.
> Given that a single split can take many minutes on a large collection, this 
> is a bottleneck at scale.
> Here is the proposed new design
> There are various Collection operations performed at Overseer. They may need 
> exclusive access at various levels. Each operation must define the Access 
> level at which the access is required. Access level is an enum. 
> CLUSTER(0)
> COLLECTION(1)
> SHARD(2)
> REPLICA(3)
> The Overseer node maintains a tree of these locks. The lock tree would look 
> as follows. The tree can be created lazily as and when tasks come up.
> {code}
> Legend: 
> C1, C2 -> Collections
> S1, S2 -> Shards 
> R1,R2,R3,R4 -> Replicas
>  Cluster
> /   \
>/ \ 
>   C1  C2
>  / \ /   \ 
> /   \   / \  
>S1   S2  S1 S2
> R1, R2  R3.R4  R1,R2   R3,R4
> {code}
> When the overseer receives a message, it tries to acquire the appropriate 
> lock from the tree. For example, if an operation needs a lock at a Collection 
> level and it needs to operate on Collection C1, the node C1 and all child 
> nodes of C1 must be free. 
> h2.Lock acquiring logic
> Each operation would start from the root of the tree (Level 0 -> Cluster) and 
> start moving down depending upon the operation. After it reaches the right 
> node, it checks if all the children are free from a lock.  If it fails to 
> acquire a lock, it remains in the work queue. A scheduler thread waits for 
> notification from the current set of tasks . Every task would do a 
> {{notify()}} on the monitor of  the scheduler thread. The thread would start 
> from the head of the queue and check all tasks to see if that task is able to 
> acquire the right lock. If yes, it is executed, if not, the task is left in 
> the work queue.  
> When a new task arrives in the work queue, the schedulerthread wakes and just 
> try to schedule that task.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-8744) Overseer operations need more fine grained mutual exclusion

2016-05-22 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-8744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-8744:
-
Attachment: SOLR-8744.patch

> Overseer operations need more fine grained mutual exclusion
> ---
>
> Key: SOLR-8744
> URL: https://issues.apache.org/jira/browse/SOLR-8744
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 5.4.1
>Reporter: Scott Blum
>Assignee: Noble Paul
>  Labels: sharding, solrcloud
> Attachments: SOLR-8744.patch, SOLR-8744.patch
>
>
> SplitShard creates a mutex over the whole collection, but, in practice, this 
> is a big scaling problem.  Multiple split shard operations could happen at 
> the time time, as long as different shards are being split.  In practice, 
> those shards often reside on different machines, so there's no I/O bottleneck 
> in those cases, just the mutex in Overseer forcing the operations to be done 
> serially.
> Given that a single split can take many minutes on a large collection, this 
> is a bottleneck at scale.
> Here is the proposed new design
> There are various Collection operations performed at Overseer. They may need 
> exclusive access at various levels. Each operation must define the Access 
> level at which the access is required. Access level is an enum. 
> CLUSTER(0)
> COLLECTION(1)
> SHARD(2)
> REPLICA(3)
> The Overseer node maintains a tree of these locks. The lock tree would look 
> as follows. The tree can be created lazily as and when tasks come up.
> {code}
> Legend: 
> C1, C2 -> Collections
> S1, S2 -> Shards 
> R1,R2,R3,R4 -> Replicas
>  Cluster
> /   \
>/ \ 
>   C1  C2
>  / \ /   \ 
> /   \   / \  
>S1   S2  S1 S2
> R1, R2  R3.R4  R1,R2   R3,R4
> {code}
> When the overseer receives a message, it tries to acquire the appropriate 
> lock from the tree. For example, if an operation needs a lock at a Collection 
> level and it needs to operate on Collection C1, the node C1 and all child 
> nodes of C1 must be free. 
> h2.Lock acquiring logic
> Each operation would start from the root of the tree (Level 0 -> Cluster) and 
> start moving down depending upon the operation. After it reaches the right 
> node, it checks if all the children are free from a lock.  If it fails to 
> acquire a lock, it remains in the work queue. A scheduler thread waits for 
> notification from the current set of tasks . Every task would do a 
> {{notify()}} on the monitor of  the scheduler thread. The thread would start 
> from the head of the queue and check all tasks to see if that task is able to 
> acquire the right lock. If yes, it is executed, if not, the task is left in 
> the work queue.  
> When a new task arrives in the work queue, the schedulerthread wakes and just 
> try to schedule that task.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-8744) Overseer operations need more fine grained mutual exclusion

2016-05-22 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-8744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-8744:
-
Attachment: SOLR-8744.patch

> Overseer operations need more fine grained mutual exclusion
> ---
>
> Key: SOLR-8744
> URL: https://issues.apache.org/jira/browse/SOLR-8744
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 5.4.1
>Reporter: Scott Blum
>Assignee: Noble Paul
>  Labels: sharding, solrcloud
> Attachments: SOLR-8744.patch
>
>
> SplitShard creates a mutex over the whole collection, but, in practice, this 
> is a big scaling problem.  Multiple split shard operations could happen at 
> the time time, as long as different shards are being split.  In practice, 
> those shards often reside on different machines, so there's no I/O bottleneck 
> in those cases, just the mutex in Overseer forcing the operations to be done 
> serially.
> Given that a single split can take many minutes on a large collection, this 
> is a bottleneck at scale.
> Here is the proposed new design
> There are various Collection operations performed at Overseer. They may need 
> exclusive access at various levels. Each operation must define the Access 
> level at which the access is required. Access level is an enum. 
> CLUSTER(0)
> COLLECTION(1)
> SHARD(2)
> REPLICA(3)
> The Overseer node maintains a tree of these locks. The lock tree would look 
> as follows. The tree can be created lazily as and when tasks come up.
> {code}
> Legend: 
> C1, C2 -> Collections
> S1, S2 -> Shards 
> R1,R2,R3,R4 -> Replicas
>  Cluster
> /   \
>/ \ 
>   C1  C2
>  / \ /   \ 
> /   \   / \  
>S1   S2  S1 S2
> R1, R2  R3.R4  R1,R2   R3,R4
> {code}
> When the overseer receives a message, it tries to acquire the appropriate 
> lock from the tree. For example, if an operation needs a lock at a Collection 
> level and it needs to operate on Collection C1, the node C1 and all child 
> nodes of C1 must be free. 
> h2.Lock acquiring logic
> Each operation would start from the root of the tree (Level 0 -> Cluster) and 
> start moving down depending upon the operation. After it reaches the right 
> node, it checks if all the children are free from a lock.  If it fails to 
> acquire a lock, it remains in the work queue. A scheduler thread waits for 
> notification from the current set of tasks . Every task would do a 
> {{notify()}} on the monitor of  the scheduler thread. The thread would start 
> from the head of the queue and check all tasks to see if that task is able to 
> acquire the right lock. If yes, it is executed, if not, the task is left in 
> the work queue.  
> When a new task arrives in the work queue, the schedulerthread wakes and just 
> try to schedule that task.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-8744) Overseer operations need more fine grained mutual exclusion

2016-05-11 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-8744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-8744:
-
Summary: Overseer operations need more fine grained mutual exclusion  (was: 
SplitShard needs finer-grained mutual exclusion)

> Overseer operations need more fine grained mutual exclusion
> ---
>
> Key: SOLR-8744
> URL: https://issues.apache.org/jira/browse/SOLR-8744
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 5.4.1
>Reporter: Scott Blum
>Assignee: Noble Paul
>  Labels: sharding, solrcloud
>
> SplitShard creates a mutex over the whole collection, but, in practice, this 
> is a big scaling problem.  Multiple split shard operations could happen at 
> the time time, as long as different shards are being split.  In practice, 
> those shards often reside on different machines, so there's no I/O bottleneck 
> in those cases, just the mutex in Overseer forcing the operations to be done 
> serially.
> Given that a single split can take many minutes on a large collection, this 
> is a bottleneck at scale.
> Here is the proposed new design
> There are various Collection operations performed at Overseer. They may need 
> exclusive access at various levels. Each operation must define the Access 
> level at which the access is required. Access level is an enum. 
> CLUSTER(0)
> COLLECTION(1)
> SHARD(2)
> REPLICA(3)
> The Overseer node maintains a tree of these locks. The lock tree would look 
> as follows. The tree can be created lazily as and when tasks come up.
> {code}
> Legend: 
> C1, C2 -> Collections
> S1, S2 -> Shards 
> R1,R2,R3,R4 -> Replicas
>  Cluster
> /   \
>/ \ 
>   C1  C2
>  / \ /   \ 
> /   \   / \  
>S1   S2  S1 S2
> R1, R2  R3.R4  R1,R2   R3,R4
> {code}
> When the overseer receives a message, it tries to acquire the appropriate 
> lock from the tree. For example, if an operation needs a lock at a Collection 
> level and it needs to operate on Collection C1, the node C1 and all child 
> nodes of C1 must be free. 
> h2.Lock acquiring logic
> Each operation would start from the root of the tree (Level 0 -> Cluster) and 
> start moving down depending upon the operation. After it reaches the right 
> node, it checks if all the children are free from a lock.  If it fails to 
> acquire a lock, it remains in the work queue. A scheduler thread waits for 
> notification from the current set of tasks . Every task would do a 
> {{notify()}} on the monitor of  the scheduler thread. The thread would start 
> from the head of the queue and check all tasks to see if that task is able to 
> acquire the right lock. If yes, it is executed, if not, the task is left in 
> the work queue.  
> When a new task arrives in the work queue, the schedulerthread wakes and just 
> try to schedule that task.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org