[ 
https://issues.apache.org/jira/browse/YARN-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14561234#comment-14561234
 ] 

Dmitry Sivachenko commented on YARN-3066:
-----------------------------------------

Solaris can use the same ssid program (it is just a simple wrapper for the 
setsid() syscall).
I just proposed the simplest fix for that problem.
A JNI wrapper sounds like a better approach.

What I want to see in any case is a loud error message when the setsid binary 
(or the setsid() syscall, if we go the JNI way) is unavailable.  Right now it 
pretends to work, and I spent some time digging out what's going wrong and why 
I see a lot of orphans.
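
To illustrate the "fail loudly" behaviour, here is a minimal sketch, not the 
actual Hadoop code.  The class SetsidCheck, the helpers canRun and execPrefix, 
the PROBE command and the failIfMissing switch are all hypothetical names made 
up for this example; only the "exec setsid" / "exec" strings come from the 
issue description.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: none of these names exist in Hadoop.
public class SetsidCheck {

    // Assumed probe: run a trivial command through setsid(1).
    static final List<String> PROBE = Arrays.asList("setsid", "sh", "-c", "true");

    // Returns true only if the command could be started and exited with status 0.
    static boolean canRun(List<String> command) {
        try {
            Process p = new ProcessBuilder(command)
                    .redirectErrorStream(true)
                    .start();
            p.getOutputStream().close();
            // Drain and discard output so the child cannot block on a full pipe.
            InputStream in = p.getInputStream();
            while (in.read() != -1) {
                // discard
            }
            return p.waitFor() == 0;
        } catch (IOException e) {
            // Typically means the binary was not found or is not executable.
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Chooses the prefix used to spawn a task. With failIfMissing set, a
    // missing setsid is a startup error instead of a silent fallback.
    static String execPrefix(boolean setsidAvailable, boolean failIfMissing) {
        if (setsidAvailable) {
            return "exec setsid";
        }
        String msg = "setsid is not available: killed jobs may leave orphaned child processes";
        if (failIfMissing) {
            throw new IllegalStateException(msg);
        }
        // Even in the lenient mode, say so loudly rather than pretending to work.
        System.err.println("WARNING: " + msg);
        return "exec";
    }
}
```

Calling SetsidCheck.execPrefix(SetsidCheck.canRun(SetsidCheck.PROBE), true) at 
startup would then either return "exec setsid" or stop with a clear error, 
instead of quietly falling back to plain "exec".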

> Hadoop leaves orphaned tasks running after job is killed
> --------------------------------------------------------
>
>                 Key: YARN-3066
>                 URL: https://issues.apache.org/jira/browse/YARN-3066
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>         Environment: Hadoop 2.4.1 (probably all later too), FreeBSD-10.1
>            Reporter: Dmitry Sivachenko
>
> When spawning a user task, the node manager checks for the setsid(1) utility 
> and spawns the task program via it. See 
> hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/DefaultContainerExecutor.java
>  for instance:
> String exec = Shell.isSetsidAvailable? "exec setsid" : "exec";
> FreeBSD, unlike Linux, does not have a setsid(1) utility, so plain "exec" is 
> used to spawn the user task.  If that task spawns other external programs 
> (this is a common case if the task program is a shell script) and the user 
> kills the job via mapred job -kill <Job>, these child processes remain running.
> 1) Why do you silently ignore the absence of setsid(1) and spawn the task 
> process via exec?  This is guaranteed to leave orphaned processes when a job 
> is prematurely killed.
> 2) FreeBSD has a third-party replacement program called ssid (which does 
> almost the same as Linux's setsid).  It would be nice to detect which binary 
> is present during the configure stage and put a @SETSID@ macro into the java 
> file to use the correct name.
> I propose to make the Shell.isSetsidAvailable test more strict and fail to 
> start if setsid is not found: at least we will know about the problem at 
> startup rather than having to guess why there are orphaned tasks running 
> forever.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
