[ 
https://issues.apache.org/jira/browse/YARN-10250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17094707#comment-17094707
 ] 

Matthew Sharp commented on YARN-10250:
--------------------------------------

The launch-container script fails on any non-zero return code. Since the 
commands that write to directory.info produce debugging information only, one 
quick approach is to force them to always return true so that the container 
relaunch is not impacted. 
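
A minimal sketch of that idea, not the actual launch_container.sh contents: it 
assumes a script that aborts on the first failing command (here via "set -e") 
and appends "|| true" to a debugging-only command. The log file and the 
non-existent path are hypothetical stand-ins chosen only to trigger a failure.

```shell
#!/usr/bin/env bash
# Sketch only: mimics a script that aborts on any non-zero exit code.
set -e

# Hypothetical stand-in for the directory.info debug log.
log_file="$(mktemp)"

# A debugging-only command that fails because the path does not exist.
# Without "|| true", "set -e" would terminate the script on this line.
find /nonexistent-path-for-demo -maxdepth 1 -ls >>"$log_file" 2>&1 || true

# Execution reaches this point because the failure above was swallowed.
result="continued"
echo "$result"
```

The trade-off is that a genuine failure in the debugging command is also 
silenced, which seems acceptable only because its output is informational.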

> Container Relaunch - find: File system loop detected
> ----------------------------------------------------
>
>                 Key: YARN-10250
>                 URL: https://issues.apache.org/jira/browse/YARN-10250
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 3.2.0
>            Reporter: Matthew Sharp
>            Priority: Major
>
> The Hive LLAP YARN service tries to relaunch after a container failure, and 
> when it retries on the same node we see it fail with:
> {code:java}
> find: File system loop detected; ‘./lib/llap-27Apr2020.tar.gz’ is part of the 
> same file system loop as ‘./lib’. {code}
>  
> YARN-8667 attempted to clean up the prior symlinks before relaunching, but in 
> this case they still exist, since the symlinks are recreated right before the 
> attempt to write to directory.info for logging.
>  
> The following line appears to be the culprit:  
> [https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainerLaunch.java#L1346]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
