[ https://issues.apache.org/jira/browse/YARN-9562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16896610#comment-16896610 ]
Eric Yang commented on YARN-9562:
---------------------------------

[~ebadger] I couldn't get very far with patch 002 applied together with YARN-9561 patch 002. I configured the following properties:

{code}
<property>
  <name>yarn.nodemanager.runtime.linux.allowed-runtimes</name>
  <value>default,docker,runc</value>
  <description>
    Comma separated list of runtimes that are allowed when using
    LinuxContainerExecutor. The allowed values are default, docker, and
    javasandbox.
  </description>
</property>
<property>
  <name>yarn.nodemanager.runtime.linux.runc.image-tag-to-manifest-plugin.local-hash-file</name>
  <value>/tmp/centos</value>
</property>
{code}

Node manager refused to start with this error:

{code}
2019-07-30 23:04:56,998 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor: IOException executing command:
java.io.InterruptedIOException: java.lang.InterruptedException
	at org.apache.hadoop.util.Shell.runCommand(Shell.java:1011)
	at org.apache.hadoop.util.Shell.run(Shell.java:901)
	at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1213)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:154)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.RuncContainerRuntime$1.run(RuncContainerRuntime.java:264)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.InterruptedException
	at java.lang.Object.wait(Native Method)
	at java.lang.Object.wait(Object.java:502)
	at java.lang.UNIXProcess.waitFor(UNIXProcess.java:395)
	at org.apache.hadoop.util.Shell.runCommand(Shell.java:1001)
	... 11 more
2019-07-30 23:04:56,999 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.RuncContainerRuntime: Failed to reap old runc layer mounts
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationException: java.io.InterruptedIOException: java.lang.InterruptedException
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:185)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.RuncContainerRuntime$1.run(RuncContainerRuntime.java:264)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.InterruptedIOException: java.lang.InterruptedException
	at org.apache.hadoop.util.Shell.runCommand(Shell.java:1011)
	at org.apache.hadoop.util.Shell.run(Shell.java:901)
	at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1213)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:154)
	... 8 more
Caused by: java.lang.InterruptedException
	at java.lang.Object.wait(Native Method)
	at java.lang.Object.wait(Object.java:502)
	at java.lang.UNIXProcess.waitFor(UNIXProcess.java:395)
	at org.apache.hadoop.util.Shell.runCommand(Shell.java:1001)
	... 11 more
{code}

It looks like it tried to clean up existing runc mounts, failed to find any, and crashed.

# What are the minimum configs required for this feature to work?
# What does the Image Tag to Hash file look like?

> Add Java changes for the new RuncContainerRuntime
> -------------------------------------------------
>
>                 Key: YARN-9562
>                 URL: https://issues.apache.org/jira/browse/YARN-9562
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Eric Badger
>            Assignee: Eric Badger
>            Priority: Major
>         Attachments: YARN-9562.001.patch, YARN-9562.002.patch
>
>
> This JIRA will be used to add the Java changes for the new
> RuncContainerRuntime. This will work off of YARN-9560 to use much of the
> existing DockerLinuxContainerRuntime code once it is moved up into an
> abstract class that can be extended.

--
This message was sent by Atlassian JIRA
(v7.6.14#76016)