[ https://issues.apache.org/jira/browse/YARN-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13887471#comment-13887471 ]
Sunil G commented on YARN-1662:
-------------------------------

If we can implement a timed reservation logic here, it will be safer for the fresh allocation to be tried on some other node. I have reviewed the scheduler part and found that this can be achieved without a separate timer thread.

addReReservation() is invoked when the same node tries to re-reserve the same application's request. This is backed by a multiset, so the internal count increments every time addReReservation() is performed. It is also incremented only once per second (the node heartbeat interval).

I wish to add code like the below in LeafQueue::assignContainer(). If the limit is exceeded, I will try to unreserve the request from the node. This code will be hit when the same application tries to re-reserve again on the same node.

    } else {
      // Reserve by 'charging' in advance...
      reserve(application, priority, node, rmContainer, container);

      // Check the re-reservation limit. If exceeded, unreserve and try for a
      // fresh allocation.
      if (RESERVATION_TIME_LIMIT != 0
          && application.getReReservations(priority) > RESERVATION_TIME_LIMIT) {
        unreserve(application, priority, node, rmContainer);
        return Resources.none();
      }
    }

So on the next node update from some other node, the CapacityScheduler can try to allocate the resource to this application.

NB: Reservation is meant to ensure that the same task can stick to the node where it runs best. A larger configurable limit, chosen based on the nature of the running tasks, can still achieve that behavior. Please share your thoughts.

> Capacity Scheduler reservation issue causes job hang
> ---------------------------------------------------
>
> Key: YARN-1662
> URL: https://issues.apache.org/jira/browse/YARN-1662
> Project: Hadoop YARN
> Issue Type: Bug
> Components: resourcemanager
> Affects Versions: 2.2.0
> Environment: Suse 11 SP1 + Linux
> Reporter: Sunil G
>
> There are 2 node managers in my cluster.
> NM1 with 8GB
> NM2 with 8GB
> I am submitting a job with the below details:
> AM with 2GB
> Map needs 5GB
> Reducer needs 3GB
> slowstart is enabled with 0.5
> 10 maps and 50 reducers are assigned.
> 5 maps are completed. Now a few reducers got scheduled.
> Now NM1 has the 2GB AM and the 3GB Reducer_1 [used 5GB].
> NM2 has the 3GB Reducer_2 [used 3GB].
> A map has now reserved 5GB on NM1, which has only 3GB free.
> It hangs forever.
> The potential issue is that the reservation is now blocked on NM1 for a map which needs 5GB,
> but Reducer_1 hangs waiting for a few map outputs.
> Reducer-side preemption also did not happen, as some headroom is still available.

-- This message was sent by Atlassian JIRA (v6.1.5#6160)
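To make the proposed counting concrete, here is a self-contained sketch of the idea, not the actual CapacityScheduler code. In the real scheduler the per-priority count lives in a multiset on the application attempt; a plain map is used here so the example compiles standalone. The class name ReservationTracker and the constructor parameter are hypothetical; the shouldUnreserve() condition mirrors the RESERVATION_TIME_LIMIT check in the snippet above, where a limit of 0 disables the feature.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the proposed re-reservation limit.
public class ReservationTracker {
    // 0 disables the limit, mirroring the RESERVATION_TIME_LIMIT != 0 guard.
    private final int reReservationLimit;
    // Per-priority re-reservation counts (a multiset in the real scheduler).
    private final Map<Integer, Integer> reReservations = new HashMap<>();

    public ReservationTracker(int reReservationLimit) {
        this.reReservationLimit = reReservationLimit;
    }

    // Called on each node heartbeat (~1s apart) that re-reserves the request.
    public void addReReservation(int priority) {
        reReservations.merge(priority, 1, Integer::sum);
    }

    public int getReReservations(int priority) {
        return reReservations.getOrDefault(priority, 0);
    }

    // True once the count exceeds the limit: drop the reservation so the
    // next node update from another node can satisfy the request instead.
    public boolean shouldUnreserve(int priority) {
        return reReservationLimit != 0
            && getReReservations(priority) > reReservationLimit;
    }

    public static void main(String[] args) {
        ReservationTracker tracker = new ReservationTracker(3);
        int priority = 1;
        // Simulate four heartbeats re-reserving the same request on one node.
        for (int i = 0; i < 4; i++) {
            tracker.addReReservation(priority);
        }
        // Count (4) now exceeds the limit (3), so the scheduler would
        // unreserve and return Resources.none().
        System.out.println(tracker.shouldUnreserve(priority)); // prints "true"
    }
}
```

Because the count only grows with heartbeats, the limit acts as an implicit timer (roughly limit-in-seconds), which is why no separate timer thread is needed.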