[ https://issues.apache.org/jira/browse/YARN-10272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17110990#comment-17110990 ]
shilongfei edited comment on YARN-10272 at 5/19/20, 9:08 AM:
-------------------------------------------------------------

*The second time, version: 3.1.0*

The initial symptom is the same as above, but this time the jstack output is different; it is shown below. The ContainersLauncher.runPreKillContainerScript() method is our own customization: it runs a script before the container exits to do some work (such as saving a jstack dump of the container), and it uses Shell to execute that script.

{code:java}
"NM ContainerManager dispatcher" #193 prio=5 os_prio=0 tid=0x00007fa79b0cc800 nid=0x493c in Object.wait() [0x00007fa5a9ac5000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at java.lang.Thread.join(Thread.java:1252)
    - locked <0x00000000e65dc540> (a org.apache.hadoop.util.Shell$1)
    at java.lang.Thread.join(Thread.java:1326)
    at org.apache.hadoop.util.Shell.joinThread(Shell.java:1057)
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:1037)
    at org.apache.hadoop.util.Shell.run(Shell.java:902)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1227)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.runPreKillContainerScript(ContainersLauncher.java:266)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:162)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:66)
    at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:198)
    at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126)
    at java.lang.Thread.run(Thread.java:748)
{code}

Shell.joinThread() joins the errThread, and the errThread is stuck reading the error stream:

{code:java}
"Thread-430" #768 prio=5 os_prio=0 tid=0x00007fa5541ef800 nid=0x57a7 runnable [0x00007fa39dcf8000]
   java.lang.Thread.State: RUNNABLE
    at java.io.FileInputStream.readBytes(Native Method)
    at java.io.FileInputStream.read(FileInputStream.java:255)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:284)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
    - locked <0x00000000eb7fe618> (a java.lang.UNIXProcess$ProcessPipeInputStream)
    at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284)
    at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326)
    at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
    - locked <0x00000000eb8cd168> (a java.io.InputStreamReader)
    at java.io.InputStreamReader.read(InputStreamReader.java:184)
    at java.io.BufferedReader.fill(BufferedReader.java:161)
    at java.io.BufferedReader.readLine(BufferedReader.java:324)
    - locked <0x00000000eb8cd168> (a java.io.InputStreamReader)
    at java.io.BufferedReader.readLine(BufferedReader.java:389)
    at org.apache.hadoop.util.Shell$1.run(Shell.java:970)
{code}

!image-2020-05-11-14-53-09-751.png!
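The two traces above show a caller waiting in an unbounded Thread.join() on a reader thread that is itself blocked in BufferedReader.readLine(). As a rough illustration of that pattern, and of one possible way to bound the wait, here is a minimal Java sketch. It is not Hadoop code and not a confirmed fix for this issue; the names BoundedStreamDrain, Result and drain are hypothetical, and the sketch assumes the caller can tolerate abandoning a reader that never sees end-of-stream.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper (not part of Hadoop): drain a stream with a bounded wait. */
public class BoundedStreamDrain {

    /** Text captured so far, plus whether the reader reached end-of-stream in time. */
    public static final class Result {
        public final String text;
        public final boolean completed;

        Result(String text, boolean completed) {
            this.text = text;
            this.completed = completed;
        }
    }

    /**
     * Reads lines from {@code in} on a daemon thread and waits at most
     * {@code timeoutMillis} for it to finish. Pass a value greater than zero:
     * Thread.join(0) waits forever, which is the behaviour seen in the trace.
     */
    public static Result drain(InputStream in, long timeoutMillis) throws InterruptedException {
        StringBuffer buf = new StringBuffer(); // synchronized, safe to read from the caller
        Thread reader = new Thread(() -> {
            try (BufferedReader br =
                     new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
                String line;
                // readLine() returns null only when every writer has closed the stream.
                while ((line = br.readLine()) != null) {
                    buf.append(line).append('\n');
                }
            } catch (IOException ignored) {
                // Stream was closed or broken; keep whatever was read.
            }
        }, "bounded-drain");
        reader.setDaemon(true); // a stuck reader must not keep the JVM alive
        reader.start();
        reader.join(timeoutMillis); // bounded, unlike a plain join()
        return new Result(buf.toString(), !reader.isAlive());
    }

    public static void main(String[] args) throws Exception {
        // A stream that ends: the reader finishes well within the timeout.
        Result done = drain(
            new ByteArrayInputStream("a\nb\n".getBytes(StandardCharsets.UTF_8)), 1000);
        System.out.println("finished=" + done.completed);

        // A pipe whose write end is never closed: the reader blocks, the caller does not.
        PipedOutputStream neverClosed = new PipedOutputStream();
        Result hung = drain(new PipedInputStream(neverClosed), 300);
        System.out.println("finished=" + hung.completed);
    }
}
```

The second case in main stands in for an error stream whose write end stays open; the caller gets control back after the timeout and can then decide whether to log, close the stream, or destroy the process.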
> Shell#runCommand() executes a shell script and gets stuck when reading stdout and stderr
> ----------------------------------------------------------------------------------------
>
>                 Key: YARN-10272
>                 URL: https://issues.apache.org/jira/browse/YARN-10272
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.6.0, 3.1.0
>            Reporter: shilongfei
>            Priority: Major
>         Attachments: image-2020-04-02-18-54-13-112.png, image-2020-04-02-18-58-39-977.png, image-2020-04-02-19-00-01-387.png, image-2020-05-11-14-53-09-751.png
>
>
> When using Shell to execute a shell script, it occasionally gets stuck while reading the input and error streams. I have encountered this situation three times; I will describe the three cases in the comments.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org