[jira] [Updated] (YARN-897) CapacityScheduler wrongly sorted queues

2013-07-19 Thread Djellel Eddine Difallah (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Djellel Eddine Difallah updated YARN-897:
-

Attachment: YARN-897-3.patch

A new version of the patch in which completedContainer takes the CSQueue to 
reinsert, as suggested by [~acmurthy].
The patch now also includes a unit test that checks the sort order of 
childQueues.

 CapacityScheduler wrongly sorted queues
 ---

 Key: YARN-897
 URL: https://issues.apache.org/jira/browse/YARN-897
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 2.0.4-alpha
Reporter: Djellel Eddine Difallah
Priority: Blocker
 Attachments: TestBugParentQueue.java, YARN-897-1.patch, 
 YARN-897-2.patch, YARN-897-3.patch


 The childQueues of a ParentQueue are stored in a TreeSet whose sort order is 
 defined by UsedCapacity. This ensures that the queue with the least 
 UsedCapacity receives resources next. On container assignment we correctly 
 update the order, but we fail to do so on container completion. This corrupts 
 the TreeSet structure, and under-capacity queues might starve for resources.
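
A minimal sketch of the effect described above, assuming a simplified queue 
class. DemoQueue, BY_USED and the capacity values are hypothetical and do not 
come from the Hadoop code base. It shows how changing the sort key of an 
element that is already in a TreeSet leaves first() pointing at a queue that 
is no longer the least used one.

```java
import java.util.Comparator;
import java.util.TreeSet;

class StaleOrderDemo {
    // Hypothetical stand-in for a child queue; not the real CSQueue interface.
    static final class DemoQueue {
        final String name;
        double usedCapacity; // mutable sort key

        DemoQueue(String name, double usedCapacity) {
            this.name = name;
            this.usedCapacity = usedCapacity;
        }
    }

    // Least used first; the name breaks ties so distinct queues never compare equal.
    static final Comparator<DemoQueue> BY_USED = (x, y) -> {
        int byUsed = Double.compare(x.usedCapacity, y.usedCapacity);
        return byUsed != 0 ? byUsed : x.name.compareTo(y.name);
    };

    // Returns the queue name that first() reports after an in-place key change.
    static String firstAfterCompletion() {
        TreeSet<DemoQueue> childQueues = new TreeSet<>(BY_USED);
        DemoQueue a = new DemoQueue("a", 0.1);
        DemoQueue b = new DemoQueue("b", 0.5);
        DemoQueue c = new DemoQueue("c", 0.9);
        childQueues.add(a);
        childQueues.add(b);
        childQueues.add(c);

        // Simulated container completion: the key changes, the tree is not told.
        c.usedCapacity = 0.0;

        // The tree keeps its old shape, so the leftmost element is still "a".
        return childQueues.first().name;
    }

    public static void main(String[] args) {
        // "c" now has the lowest usage, but the set still hands out "a" first.
        System.out.println("first() = " + firstAfterCompletion());
    }
}
```

In this sketch queue "c" would keep waiting behind "a" and "b" until something 
forces the set to be re-sorted, which is the starvation risk the description 
refers to.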

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-897) CapacityScheduler wrongly sorted queues

2013-07-19 Thread Djellel Eddine Difallah (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Djellel Eddine Difallah updated YARN-897:
-

Attachment: YARN-897-4.patch

[~acmurthy] Unfortunately, as I pointed out above to Omkar, we have to iterate 
because at that point in time the childQueues are already out of order and we 
can't use the TreeSet lookup methods. For the same reason, 
assignContainersToChildQueues also iterates and then removes/adds.
This patch moves the code of reinsertQueue inline into completedContainer.
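
The iterate-then-remove/add pattern mentioned for the assignment path can be 
sketched roughly as follows. This is an illustration under assumptions, not the 
code of assignContainersToChildQueues: the class Q, its limit field and 
assignOne are hypothetical names, and usage is counted in whole containers to 
keep the arithmetic exact.

```java
import java.util.Comparator;
import java.util.Iterator;
import java.util.TreeSet;

class AssignLoopDemo {
    // Hypothetical queue: usage counted in containers, capped by a limit.
    static final class Q {
        final String name;
        final int limit;
        int used; // mutable sort key

        Q(String name, int used, int limit) {
            this.name = name;
            this.used = used;
            this.limit = limit;
        }
    }

    // Least used first, ties broken by name.
    static final Comparator<Q> BY_USED = (x, y) -> {
        int byUsed = Integer.compare(x.used, y.used);
        return byUsed != 0 ? byUsed : x.name.compareTo(y.name);
    };

    // Gives one container to the first queue with room; returns its name or null.
    static String assignOne(TreeSet<Q> childQueues) {
        for (Iterator<Q> it = childQueues.iterator(); it.hasNext();) {
            Q q = it.next();
            if (q.used < q.limit) {
                q.used++;           // the key changes while q is still in the tree
                it.remove();        // unlinks the current node, no comparator lookup
                childQueues.add(q); // reinsert at the position of the new key
                return q.name;      // stop: the set was modified outside the iterator
            }
        }
        return null;
    }

    // Runs four assignments and returns the names of the queues that were served.
    static String run() {
        TreeSet<Q> childQueues = new TreeSet<>(BY_USED);
        childQueues.add(new Q("a", 0, 2));
        childQueues.add(new Q("b", 1, 5));
        StringBuilder served = new StringBuilder();
        for (int i = 0; i < 4; i++) {
            served.append(assignOne(childQueues));
        }
        return served.toString();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Iterator.remove() is used here because it operates on the node the iterator 
just returned, so it does not depend on the comparator finding the element 
under its new key.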

 CapacityScheduler wrongly sorted queues
 ---

 Key: YARN-897
 URL: https://issues.apache.org/jira/browse/YARN-897
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 2.0.4-alpha
Reporter: Djellel Eddine Difallah
Priority: Blocker
 Attachments: TestBugParentQueue.java, YARN-897-1.patch, 
 YARN-897-2.patch, YARN-897-3.patch, YARN-897-4.patch





[jira] [Updated] (YARN-897) CapacityScheduler wrongly sorted queues

2013-07-11 Thread Djellel Eddine Difallah (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Djellel Eddine Difallah updated YARN-897:
-

Attachment: YARN-897-1.patch

Attached is a first patch attempt to address the bug:
Upon container completion, which triggers completedContainer(), remove and 
reinsert the queue into its parent's childQueues. This operation is done 
recursively, starting from the leafQueue where the container was released. 
Thus, by handling both cases in which usedCapacity changes (assignment 
and completion), the TreeSet remains properly sorted.
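
The bottom-up reinsertion described above might look roughly like the sketch 
below. Node, release and reinsert are hypothetical names chosen for the 
example, not the methods of the patch, usage is counted in whole containers, 
and the walk is written as a loop rather than a recursion. The sketch assumes 
each parent's usage is the sum of its children's usage.

```java
import java.util.Comparator;
import java.util.Iterator;
import java.util.TreeSet;

class BottomUpReinsertDemo {
    // Least used first, ties broken by name.
    static final Comparator<Node> BY_USED = (x, y) -> {
        int byUsed = Integer.compare(x.used, y.used);
        return byUsed != 0 ? byUsed : x.name.compareTo(y.name);
    };

    // Hypothetical queue node; does not mirror the real ParentQueue/LeafQueue API.
    static final class Node {
        final String name;
        final Node parent; // null for the root
        int used;          // sort key inside parent.children
        final TreeSet<Node> children = new TreeSet<>(BY_USED);

        Node(String name, Node parent, int used) {
            this.name = name;
            this.parent = parent;
            this.used = used;
            if (parent != null) {
                parent.children.add(this);
            }
        }
    }

    // Removes child by identity (its key is already stale), then adds it back.
    static void reinsert(Node parent, Node child) {
        for (Iterator<Node> it = parent.children.iterator(); it.hasNext();) {
            if (it.next() == child) {
                it.remove();
                break;
            }
        }
        parent.children.add(child);
    }

    // Container completion: lower the usage from the leaf up to the root and
    // re-sort each node inside its parent's children set.
    static void release(Node leaf, int amount) {
        for (Node node = leaf; node != null; node = node.parent) {
            node.used -= amount;
            if (node.parent != null) {
                reinsert(node.parent, node);
            }
        }
    }

    static String names(TreeSet<Node> set) {
        StringBuilder sb = new StringBuilder();
        for (Node n : set) {
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append(n.name);
        }
        return sb.toString();
    }

    // Builds root -> {p1 -> {x, z}, p2 -> {y}} and releases 3 containers from x.
    static String run() {
        Node root = new Node("root", null, 12);
        Node p1 = new Node("p1", root, 7);
        Node p2 = new Node("p2", root, 5);
        Node x = new Node("x", p1, 4);
        new Node("z", p1, 3);
        new Node("y", p2, 5);
        release(x, 3);
        return names(root.children) + "|" + names(p1.children);
    }

    public static void main(String[] args) {
        System.out.println(run()); // order under root, then order under p1
    }
}
```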

 CapacityScheduler wrongly sorted queues
 ---

 Key: YARN-897
 URL: https://issues.apache.org/jira/browse/YARN-897
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Reporter: Djellel Eddine Difallah
 Attachments: TestBugParentQueue.java, YARN-897-1.patch





[jira] [Commented] (YARN-897) CapacityScheduler wrongly sorted queues

2013-07-11 Thread Djellel Eddine Difallah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13706348#comment-13706348
 ] 

Djellel Eddine Difallah commented on YARN-897:
--

Omkar, thanks for the feedback
{quote}any reason for this even after this patch? if we don't see any other 
issues then why not just use childQueues.remove instead of iterating?{quote}
The tree is already out of order because of the new usedCapacity, so remove() 
won't work. We have to iterate to remove the queue and then add() it back to 
fix the order.
{quote}reinsertQueue could be marked synchronized? thoughts? But yeah.. without 
that too it is thread safe as we are locking it at 
CapacityScheduler.nodeUpdate(). but still it is better to mark it.{quote}
OK, it sounds reasonable to mark it synchronized.
{quote}LOG.info("Re-sorting queues since queue got completed: " + 
childQueue.getQueuePath() +
nit. line > 80{quote}
Sure.
{quote}at present we send the container completed event to leaf queue and then 
keep propagating it till root. why not send the event to root, grab the locks 
from root to leaf and update it? any thoughts?{quote}
Because the released container is linked to a leaf queue, and we have to walk 
bottom-up to figure out which parents to propagate to. The assignment phase, 
however, works the way you described.

 CapacityScheduler wrongly sorted queues
 ---

 Key: YARN-897
 URL: https://issues.apache.org/jira/browse/YARN-897
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Reporter: Djellel Eddine Difallah
 Attachments: TestBugParentQueue.java, YARN-897-1.patch





[jira] [Updated] (YARN-897) CapacityScheduler wrongly sorted queues

2013-07-11 Thread Djellel Eddine Difallah (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Djellel Eddine Difallah updated YARN-897:
-

Attachment: YARN-897-2.patch

Patch reflecting Omkar's comments. 1) add synchronized to reinsertQueue 2) 
reduce line length

 CapacityScheduler wrongly sorted queues
 ---

 Key: YARN-897
 URL: https://issues.apache.org/jira/browse/YARN-897
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Reporter: Djellel Eddine Difallah
 Attachments: TestBugParentQueue.java, YARN-897-1.patch, 
 YARN-897-2.patch





[jira] [Updated] (YARN-897) CapacityScheduler wrongly sorted queues

2013-07-11 Thread Djellel Eddine Difallah (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Djellel Eddine Difallah updated YARN-897:
-

Attachment: YARN-897-1.patch

 CapacityScheduler wrongly sorted queues
 ---

 Key: YARN-897
 URL: https://issues.apache.org/jira/browse/YARN-897
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Reporter: Djellel Eddine Difallah
 Attachments: TestBugParentQueue.java, YARN-897-1.patch, 
 YARN-897-2.patch





[jira] [Created] (YARN-897) CapacityScheduler wrongly sorted queues

2013-07-03 Thread Djellel Eddine Difallah (JIRA)
Djellel Eddine Difallah created YARN-897:


 Summary: CapacityScheduler wrongly sorted queues
 Key: YARN-897
 URL: https://issues.apache.org/jira/browse/YARN-897
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Reporter: Djellel Eddine Difallah


The childQueues of a ParentQueue are stored in a TreeSet whose sort order is 
defined by UsedCapacity. This ensures that the queue with the least 
UsedCapacity receives resources next. On container assignment we correctly 
update the order, but we fail to do so on container completion. This corrupts 
the TreeSet structure, and under-capacity queues might starve for resources.




[jira] [Updated] (YARN-897) CapacityScheduler wrongly sorted queues

2013-07-03 Thread Djellel Eddine Difallah (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Djellel Eddine Difallah updated YARN-897:
-

Attachment: TestBugParentQueue.java

Simple JUnit Test that triggers the bug.

 CapacityScheduler wrongly sorted queues
 ---

 Key: YARN-897
 URL: https://issues.apache.org/jira/browse/YARN-897
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Reporter: Djellel Eddine Difallah
 Attachments: TestBugParentQueue.java





[jira] [Commented] (YARN-897) CapacityScheduler wrongly sorted queues

2013-07-03 Thread Djellel Eddine Difallah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13699373#comment-13699373
 ] 

Djellel Eddine Difallah commented on YARN-897:
--

We spotted this bug while experimenting with dynamic queue updates. The TreeSet 
methods .contains() and .remove() failed to find a queue that we knew was 
there, which hinted that the tree was no longer properly sorted.
The attached test is a [simple junit test | 
https://issues.apache.org/jira/secure/attachment/12590676/TestBugParentQueue.java]
 inspired by the existing capacity scheduler tests. It simulates 
the scenario that [~curino] describes above and displays the order in which the 
childQueues are left after a couple of container assignments and completions.
I will post a first version of a patch that re-inserts the queue of the recently 
completed container (and all its parents) into their respective parents' 
childQueues.
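
The lookup failure mentioned above can be reproduced in isolation with a small 
sketch. It is not the attached TestBugParentQueue.java; the class Q, the 
comparator and the capacity values are hypothetical. After the sort key of an 
element changes in place, contains() and remove() follow the comparator down 
the wrong branch of the tree, while plain iteration still reaches the element.

```java
import java.util.Comparator;
import java.util.TreeSet;

class LostElementDemo {
    // Hypothetical queue with a mutable sort key.
    static final class Q {
        final String name;
        double usedCapacity;

        Q(String name, double usedCapacity) {
            this.name = name;
            this.usedCapacity = usedCapacity;
        }
    }

    // Least used first, ties broken by name.
    static final Comparator<Q> BY_USED = (x, y) -> {
        int byUsed = Double.compare(x.usedCapacity, y.usedCapacity);
        return byUsed != 0 ? byUsed : x.name.compareTo(y.name);
    };

    // Returns {contains(c), remove(c), c reachable by iteration} after the key change.
    static boolean[] probe() {
        TreeSet<Q> childQueues = new TreeSet<>(BY_USED);
        Q a = new Q("a", 0.1);
        Q b = new Q("b", 0.5);
        Q c = new Q("c", 0.9);
        childQueues.add(a);
        childQueues.add(b);
        childQueues.add(c);

        // c sits at the far right of the tree; its new key sorts to the far left.
        c.usedCapacity = 0.0;

        boolean found = childQueues.contains(c);  // searches the left side, misses c
        boolean removed = childQueues.remove(c);  // same lookup, same miss
        boolean iterated = false;
        for (Q q : childQueues) {                 // iteration ignores the comparator
            if (q == c) {
                iterated = true;
            }
        }
        return new boolean[] {found, removed, iterated};
    }

    public static void main(String[] args) {
        boolean[] r = probe();
        System.out.println("contains=" + r[0] + " remove=" + r[1] + " iterated=" + r[2]);
    }
}
```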

 CapacityScheduler wrongly sorted queues
 ---

 Key: YARN-897
 URL: https://issues.apache.org/jira/browse/YARN-897
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Reporter: Djellel Eddine Difallah
 Attachments: TestBugParentQueue.java


