Ferenc Erdelyi created YARN-11709:
-------------------------------------

             Summary: NodeManager should be shut down or blacklisted when it cannot run program "/var/lib/yarn-ce/bin/container-executor"
                 Key: YARN-11709
                 URL: https://issues.apache.org/jira/browse/YARN-11709
             Project: Hadoop YARN
          Issue Type: Improvement
          Components: container-executor
            Reporter: Ferenc Erdelyi


When the NodeManager encounters the "No such file or directory" error shown 
below for the container-executor binary, it should stop participating in the 
cluster: it cannot run any containers and only causes jobs to fail.


{code:java}
2023-01-18 10:08:10,600 WARN org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Exit code from container container_e159_1673543180101_9407_02_000014 startLocalizer is : -1
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationException: java.io.IOException: Cannot run program "/var/lib/yarn-ce/bin/container-executor": error=2, No such file or directory
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:183)
    at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.startLocalizer(LinuxContainerExecutor.java:403)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1250)
Caused by: java.io.IOException: Cannot run program "/var/lib/yarn-ce/bin/container-executor": error=2, No such file or directory
{code}
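
One way to detect this condition before any container is launched is to probe 
the binary directly. The sketch below is a standalone illustration, not 
existing YARN code: the class ContainerExecutorProbe, its Status enum, and the 
method names are all hypothetical, and it assumes that a missing or 
non-executable file at the configured path is reason enough to treat the node 
as unhealthy. How the result would be wired into the NodeManager's health 
reporting is deliberately left open.

{code:java}
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper for illustration only; not part of the YARN code base. */
final class ContainerExecutorProbe {

  /** Outcome of inspecting the container-executor path. */
  enum Status { OK, MISSING, NOT_EXECUTABLE }

  private ContainerExecutorProbe() {}

  /** Classifies the file at the given path without running it. */
  static Status probe(Path executable) {
    // Anything that is not a regular file (absent, or a directory) is MISSING.
    if (!Files.isRegularFile(executable)) {
      return Status.MISSING;
    }
    // The file exists but the current JVM user may not execute it.
    if (!Files.isExecutable(executable)) {
      return Status.NOT_EXECUTABLE;
    }
    return Status.OK;
  }

  /** Assumption: any status other than OK means the node cannot run containers. */
  static boolean shouldReportUnhealthy(Path executable) {
    return probe(executable) != Status.OK;
  }

  public static void main(String[] args) {
    // Defaults to the path from the log above; pass another path as argument.
    String raw = args.length > 0
        ? args[0]
        : "/var/lib/yarn-ce/bin/container-executor";
    Path path = Paths.get(raw);
    System.out.println(path + " -> " + probe(path));
  }
}
{code}

Run with no arguments, it prints the path from the log above followed by the 
probe status. This only checks file presence and the execute permission for 
the current user; it does not validate ownership, setuid bits, or the 
container-executor configuration.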





