[ https://issues.apache.org/jira/browse/YARN-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16109918#comment-16109918 ]
Eric Badger commented on YARN-6846:
-----------------------------------

bq. If I'm reading the man pages correctly for geteuid(), seteuid(), and readdir(), they don't generate ENOENT

{{geteuid()}} and {{seteuid()}} aren't the calls that set {{errno}} in the code change in the first block referenced (1837).
{noformat}
-  if (rmdir(path) != 0) {
+  if (rmdir(path) != 0 && errno != ENOENT) {
{noformat}
{{rmdir(path)}} is what sets {{errno}} here, and it can fail with {{ENOENT}}.

As far as {{readdir()}} goes, it looks like POSIX lists {{ENOENT}} as a possible error, while the Linux man page doesn't. I think it's better to go with POSIX here, but I'll defer to [~jlowe] on that.

http://pubs.opengroup.org/onlinepubs/9699919799/functions/readdir.html
http://man7.org/linux/man-pages/man3/readdir.3.html

> Nodemanager can fail to fully delete application local directories when applications are killed
> -----------------------------------------------------------------------------------------------
>
>                 Key: YARN-6846
>                 URL: https://issues.apache.org/jira/browse/YARN-6846
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.8.1
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>            Priority: Critical
>         Attachments: YARN-6846.001.patch, YARN-6846.002.patch, YARN-6846.003.patch
>
>
> When an application is killed all of the running containers are killed and
> the app waits for the containers to complete before cleaning up. As each
> container completes the container directory is deleted via the
> DeletionService. After all containers have completed the app completes and
> the app directory is deleted. If the app completes quickly enough then the
> deletion of the container and app directories can race against each other.
> If the container deletion executor deletes a file just before the application
> deletion executor then it can cause the application deletion executor to
> fail, leaving the remaining entries in the application directory lingering.
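
For illustration only, here is a minimal C sketch of the {{ENOENT}} handling discussed in this comment. It is not the code from any of the attached patches. The helper names {{rmdir_ignore_missing}} and {{scan_dir_ignore_missing}} are hypothetical, and the sketch assumes a concurrent deleter may remove an entry at any point, so "already gone" is treated as success rather than failure.

```c
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch): remove an empty directory.
 * ENOENT means another deleter got there first, so it counts as success.
 * Returns 0 if the directory is gone afterwards, -1 on any other error.
 */
static int rmdir_ignore_missing(const char *path) {
  if (rmdir(path) != 0 && errno != ENOENT) {
    fprintf(stderr, "Couldn't delete directory %s - %s\n",
            path, strerror(errno));
    return -1;
  }
  return 0;
}

/*
 * Hypothetical helper: count the entries in a directory, tolerating the
 * directory disappearing underneath us. readdir() returns NULL both at
 * end-of-stream and on error, so errno is cleared before each call.
 * POSIX lists ENOENT as a possible readdir() error; it is treated here
 * the same as end-of-directory.
 * Returns the number of entries seen (excluding "." and ".."), or -1.
 */
static int scan_dir_ignore_missing(const char *path) {
  DIR *dir = opendir(path);
  if (dir == NULL) {
    return errno == ENOENT ? 0 : -1;
  }
  int count = 0;
  for (;;) {
    errno = 0;
    struct dirent *ent = readdir(dir);
    if (ent == NULL) {
      break;
    }
    if (strcmp(ent->d_name, ".") != 0 && strcmp(ent->d_name, "..") != 0) {
      count++;
    }
  }
  int saved_errno = errno;  /* closedir() may overwrite errno */
  closedir(dir);
  return (saved_errno == 0 || saved_errno == ENOENT) ? count : -1;
}
```

With helpers like these, calling {{rmdir_ignore_missing()}} twice on the same path succeeds both times, which is the behavior the application deletion executor needs when the container deletion executor has already removed the entry.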
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org