Wilfred Spiegelenburg created YARN-8436:
-------------------------------------------

             Summary: FSParentQueue: Comparison method violates its general contract
                 Key: YARN-8436
                 URL: https://issues.apache.org/jira/browse/YARN-8436
             Project: Hadoop YARN
          Issue Type: Bug
          Components: fairscheduler
    Affects Versions: 3.1.0
            Reporter: Wilfred Spiegelenburg


The ResourceManager can fail while sorting queues if an update comes in:
{code:java}
FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type NODE_UPDATE to the scheduler
java.lang.IllegalArgumentException: Comparison method violates its general contract!
        at java.util.TimSort.mergeLo(TimSort.java:777)
        at java.util.TimSort.mergeAt(TimSort.java:514)
...
        at java.util.Collections.sort(Collections.java:175)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.assignContainer(FSParentQueue.java:223)
{code}
The sort fails because the objects being sorted change while the sort is running. The sequence is:
 * An update from a node comes in as a heartbeat.
 * The update triggers a check to see whether a container can be assigned on that node.
 * The scheduler walks the queue hierarchy top down to find a queue to assign the container to.
 * For each parent queue, the child queues are sorted in {{assignContainer}} to decide which queue to descend into.
 * The parent queue is locked while sorting to prevent changes, but the child queues being sorted are not locked.

If a different node update changes a child queue while the sort is in progress, nothing prevents it. The comparator can then return inconsistent results for the same pair of queues within a single sort, which TimSort detects and reports as a contract violation. The comparator itself is not broken.
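
To illustrate the reasoning above, here is a minimal sketch, not the FairScheduler code. The names {{ChildQueue}}, {{SnapshotSortSketch}}, {{LIVE}} and {{sortBySnapshot}} are hypothetical, and a single {{long}} usage value stands in for whatever the real comparator reads. It shows that a comparator reading live state can flip its answer for the same pair once the state changes, and one possible way to avoid that by sorting on values captured once before the sort. This is an assumption about a possible mitigation, not a statement of how the issue will be fixed.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for a child queue; not the real FSQueue class.
final class ChildQueue {
    final String name;
    private volatile long usage; // may be changed by another node update

    ChildQueue(String name, long usage) {
        this.name = name;
        this.usage = usage;
    }

    long getUsage() { return usage; }

    void setUsage(long usage) { this.usage = usage; }
}

final class SnapshotSortSketch {
    // Reads live state on every call, so its answer for the same pair
    // can change if a queue is updated while a sort is running.
    static final Comparator<ChildQueue> LIVE =
        Comparator.comparingLong(ChildQueue::getUsage);

    // Immutable (queue, usage) pair captured once before sorting.
    private static final class Entry {
        final ChildQueue queue;
        final long usage;

        Entry(ChildQueue queue, long usage) {
            this.queue = queue;
            this.usage = usage;
        }
    }

    // Sorts on a snapshot of the usage values, so later updates to the
    // queues cannot make the comparator inconsistent during the sort.
    static List<ChildQueue> sortBySnapshot(List<ChildQueue> children) {
        List<Entry> entries = new ArrayList<>(children.size());
        for (ChildQueue q : children) {
            entries.add(new Entry(q, q.getUsage()));
        }
        entries.sort(Comparator.comparingLong((Entry e) -> e.usage));

        List<ChildQueue> sorted = new ArrayList<>(entries.size());
        for (Entry e : entries) {
            sorted.add(e.queue);
        }
        return sorted;
    }

    public static void main(String[] args) {
        ChildQueue a = new ChildQueue("a", 10);
        ChildQueue b = new ChildQueue("b", 20);

        int before = LIVE.compare(a, b); // a sorts before b
        a.setUsage(30);                  // simulated concurrent update
        int after = LIVE.compare(a, b);  // same pair, opposite answer

        System.out.println("before=" + Integer.signum(before)
            + " after=" + Integer.signum(after));
        System.out.println("first by snapshot: "
            + sortBySnapshot(List.of(a, b)).get(0).name);
    }
}
```

The trade-off in this sketch is that the resulting order reflects the state at snapshot time and may already be slightly stale when it is used. Locking every child queue for the duration of the sort would avoid the staleness but hold more locks on the scheduling path.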
