[jira] [Comment Edited] (YARN-4597) Add SCHEDULE to NM container lifecycle

Arun Suresh (JIRA) Mon, 07 Nov 2016 12:59:32 -0800

    [ 
https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15645284#comment-15645284
 ]


Arun Suresh edited comment on YARN-4597 at 11/7/16 8:58 PM:
------------------------------------------------------------

Rebased patch after YARN-2995 and latest commits to trunk.

[~jianhe], thanks for the comments..
bq. maybe rename ContainerScheduler#runningContainers to scheduledContainers
Given that the SCHEDULED state is a state that comes before RUNNING. and the 
*runningContainers* collection actually holds containers that have been marked 
to run by the scheduler, am thinking scheduled containers may not be apt here. 
How about *scheduledToRunContainers* ?

bq. The ContainerLaunch#killedBeforeStart flag, looks like the exising flag 
'shouldLaunchContainer' serves the same purpose, can we reuse that ? if so, the 
container#isMarkedToKill is also not needed.
Hmm.. my understanding is that it is slightly different. The 
*shouldLaunchContainer* is IIUC, used during the recovery process to signal 
that the container should not be launched. What *killedBeforeStart* aims to do 
is to notify the *ContainerLaunch* (which runs in a different thread) that the 
Scheduler, which had requested to start the container earlier, in the last 
moment changed its mind and decided to kill it. Using shouldLaunchContainer 
also causes the CONTAINER_LAUNCH event to be fired which I do not want.

bq. NodeManager#containerScheduler variable not used, remove
Done

bq. I think this comment is not addressed ? "In case we exceed the max-queue 
length, we are killing the container directly instead of queueing the 
container, in this case, we should not store the container as queued?"
Yeah.. meant to comment on it. This is actually the desired behavior. Once 
queue limit is reached, no new opportunistic containers should also be queued. 
The AM is free to request it again. The MRAppMaster, for eg. re-requests the 
same task as a GUARANTEED container.

Hope this made sense ?


was (Author: asuresh):
Rebased patch after YARN-2995 and latest commits to trunk.

[~jianhe], thanks for the comments..
bq. maybe rename ContainerScheduler#runningContainers to scheduledContainers
Given that we the SCHEDULED state is a state that comes before RUNNING. and the 
*runningContainers* collection actually holds containers that have been marked 
to run by the scheduler, am thinking scheduled containers may not be apt here. 
How about *scheduledToRunContainers* ?

bq. The ContainerLaunch#killedBeforeStart flag, looks like the exising flag 
'shouldLaunchContainer' serves the same purpose, can we reuse that ? if so, the 
container#isMarkedToKill is also not needed.
Hmm.. my understanding is that it is slightly different. The 
*shouldLaunchContainer* is IIUC, used during the recovery process to signal 
that the container should not be launched. What *killedBeforeStart* aims to do 
is to notify the *ContainerLaunch* (which runs in a different thread) that the 
Scheduler, which had requested to start the container earlier, in the last 
moment changed its mind and decided to kill it. Using shouldLaunchContainer 
also causes the CONTAINER_LAUNCH event to be fired which I do not want.

bq. NodeManager#containerScheduler variable not used, remove
Done

bq. I think this comment is not addressed ? "In case we exceed the max-queue 
length, we are killing the container directly instead of queueing the 
container, in this case, we should not store the container as queued?"
Yeah.. meant to comment on it. This is actually the desired behavior. Once 
queue limit is reached, no new opportunistic containers should also be queued. 
The AM is free to request it again. The MRAppMaster, for eg. re-requests the 
same task as a GUARANTEED container.

Hope this made sense ?

> Add SCHEDULE to NM container lifecycle
> --------------------------------------
>
>                 Key: YARN-4597
>                 URL: https://issues.apache.org/jira/browse/YARN-4597
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: nodemanager
>            Reporter: Chris Douglas
>            Assignee: Arun Suresh
>              Labels: oct16-hard
>         Attachments: YARN-4597.001.patch, YARN-4597.002.patch, 
> YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, 
> YARN-4597.006.patch
>
>
> Currently, the NM immediately launches containers after resource 
> localization. Several features could be more cleanly implemented if the NM 
> included a separate stage for reserving resources.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Comment Edited] (YARN-4597) Add SCHEDULE to NM container lifecycle

Reply via email to