[ https://issues.apache.org/jira/browse/HADOOP-15372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16812539#comment-16812539 ]
Jim Brennan commented on HADOOP-15372:
--------------------------------------

[~miklos.szeg...@cloudera.com], [~ebadger], I recently debugged a case where we were (still) leaking tmp dirs for localized tarballs in our 2.8 code. The problem turned out to be not that we were failing to kill all the shells, but that we were only killing the first subshell in the tar command, which was:

{{gzip -dc inFile | ( cd untarDir; tar -xf)}}

When I went to attempt to reproduce the problem in 3.x (trunk), I was unable to get it to happen. I believe this was fixed by YARN-2185, which changed the localization code to use runCommandOnStream(). Because there are threads for the input/output of the shell command, it is killed when the threads are killed.

So I think this Jira can be closed. Do you guys agree?

> Race conditions and possible leaks in the Shell class
> -----------------------------------------------------
>
>                 Key: HADOOP-15372
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15372
>             Project: Hadoop Common
>          Issue Type: Bug
>   Affects Versions: 2.10.0, 3.2.0
>            Reporter: Miklos Szegedi
>            Assignee: Eric Badger
>            Priority: Minor
>         Attachments: HADOOP-15372.001.patch
>
>
> YARN-5641 introduced some cleanup code in the Shell class. It has a race
> condition. {{Shell.runCommand()}} can be called while/after
> {{Shell.getAllShells()}} returned all the shells to be cleaned up. This new
> thread can avoid the clean up, so that the process held by it can be leaked
> causing leaked localized files/etc.
> I see another issue as well. {{Shell.runCommand()}} has a finally block with
> a {{process.destroy();}} to clean up. However, the try catch block does not
> cover all instructions after the process is started, so for example we can
> exit the thread and leak the process, if
> {{timeOutTimer.schedule(timeoutTimerTask, timeOutInterval);}} causes an
> exception.
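
To illustrate the second issue in the description (the finally block not covering every instruction that runs after the process is started), here is a minimal sketch. It is not the actual Shell.runCommand() code or the attached patch: the class GuardedShell and the method awaitWithTimeout are hypothetical names, and the sketch only assumes the standard java.lang.Process and java.util.Timer APIs. The point it shows is that the timer scheduling sits inside the try block, so an exception from schedule() still reaches process.destroy().

```java
import java.util.Timer;
import java.util.TimerTask;

// Hypothetical helper, for illustration only; not part of Hadoop.
final class GuardedShell {

    private GuardedShell() {
    }

    /**
     * Waits for an already started process, destroying it if it runs longer
     * than timeoutMs. Everything that happens after the process has been
     * started is inside the try block, so the finally block always runs.
     */
    static int awaitWithTimeout(final Process process, long timeoutMs)
            throws InterruptedException {
        Timer timer = null;
        try {
            // Daemon timer so it never keeps the JVM alive on its own.
            timer = new Timer("shell-timeout", true);
            // schedule() throws IllegalArgumentException for a negative
            // delay. Because the call is inside the try block, the finally
            // block below still destroys the process in that case.
            timer.schedule(new TimerTask() {
                @Override
                public void run() {
                    process.destroy();
                }
            }, timeoutMs);
            return process.waitFor();
        } finally {
            if (timer != null) {
                timer.cancel();
            }
            // Harmless if the process has already exited.
            process.destroy();
        }
    }
}
```

A caller would start the process first, for example with new ProcessBuilder("sleep", "30").start(), and pass the result to awaitWithTimeout(). This does not address the first issue (the race with Shell.getAllShells()), nor the subshell case from the comment above, since destroy() only signals the immediate child process.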