[ https://issues.apache.org/jira/browse/HADOOP-13770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jason Lowe moved YARN-4467 to HADOOP-13770:
-------------------------------------------

    Target Version/s: 2.8.0  (was: 2.8.0)
         Component/s:     (was: nodemanager)
                      util
                 Key: HADOOP-13770  (was: YARN-4467)
             Project: Hadoop Common  (was: Hadoop YARN)

> Shell.checkIsBashSupported swallowed an interrupted exception
> -------------------------------------------------------------
>
>                 Key: HADOOP-13770
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13770
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: util
>            Reporter: Wei-Chiu Chuang
>            Assignee: Wei-Chiu Chuang
>            Priority: Blocker
>              Labels: oct16-easy, shell, supportability
>         Attachments: HADOOP-12652.001.patch, YARN-4467.001.patch
>
>
> Edit: move this JIRA from HADOOP to YARN, as Shell.checkIsBashSupported() is
> used only in YARN.
> Shell.checkIsBashSupported() runs a bash shell command to verify that the
> system supports bash. However, its error message is misleading, and the logic
> should be updated.
> If the shell command throws an IOException, that does not imply that bash
> failed to run. If the shell command process was interrupted, its internal
> logic throws an InterruptedIOException, which is a subclass of IOException.
> {code:title=Shell.checkIsBashSupported|borderStyle=solid}
>     ShellCommandExecutor shexec;
>     boolean supported = true;
>     try {
>       String[] args = {"bash", "-c", "echo 1000"};
>       shexec = new ShellCommandExecutor(args);
>       shexec.execute();
>     } catch (IOException ioe) {
>       LOG.warn("Bash is not supported by the OS", ioe);
>       supported = false;
>     }
> {code}
> An example of this appeared in a recent Jenkins job:
> https://builds.apache.org/job/PreCommit-HADOOP-Build/8257/testReport/org.apache.hadoop.ipc/TestRPCWaitForProxy/testInterruptedWaitForProxy/
> The test logic in TestRPCWaitForProxy.testInterruptedWaitForProxy starts a
> thread, waits 1 second, and interrupts the thread, expecting the thread
> to terminate.
> However, the method Shell.checkIsBashSupported swallowed the interrupt, and
> the test therefore failed.
> {noformat}
> 2015-12-16 21:31:53,797 WARN  util.Shell (Shell.java:checkIsBashSupported(718)) - Bash is not supported by the OS
> java.io.InterruptedIOException: java.lang.InterruptedException
>       at org.apache.hadoop.util.Shell.runCommand(Shell.java:930)
>       at org.apache.hadoop.util.Shell.run(Shell.java:838)
>       at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1117)
>       at org.apache.hadoop.util.Shell.checkIsBashSupported(Shell.java:716)
>       at org.apache.hadoop.util.Shell.<clinit>(Shell.java:705)
>       at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:79)
>       at org.apache.hadoop.security.SecurityUtil.getAuthenticationMethod(SecurityUtil.java:639)
>       at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:273)
>       at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:261)
>       at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:803)
>       at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:773)
>       at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:646)
>       at org.apache.hadoop.ipc.RPC.waitForProtocolProxy(RPC.java:397)
>       at org.apache.hadoop.ipc.RPC.waitForProtocolProxy(RPC.java:350)
>       at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:330)
>       at org.apache.hadoop.ipc.TestRPCWaitForProxy$RpcThread.run(TestRPCWaitForProxy.java:115)
> Caused by: java.lang.InterruptedException
>       at java.lang.Object.wait(Native Method)
>       at java.lang.Object.wait(Object.java:503)
>       at java.lang.UNIXProcess.waitFor(UNIXProcess.java:264)
>       at org.apache.hadoop.util.Shell.runCommand(Shell.java:920)
>       ... 15 more
> {noformat}
> The original design is not desirable, as it swallowed a potential interrupt,
> causing TestRPCWaitForProxy.testInterruptedWaitForProxy to fail.
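The handling described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the attached patch: `BashCheckSketch`, `Command`, and `isSupported` are hypothetical names standing in for the real Shell code. The idea is to catch InterruptedIOException before the broader IOException, restore the thread's interrupt flag, and rethrow, so an interrupt is never reported as "bash is not supported".

```java
import java.io.IOException;
import java.io.InterruptedIOException;

class BashCheckSketch {

    /** Hypothetical stand-in for the shell command that the check executes. */
    interface Command {
        void execute() throws IOException;
    }

    /**
     * Returns false only when the command failed for an ordinary I/O reason.
     * An interrupt is propagated to the caller instead of being swallowed.
     */
    static boolean isSupported(Command cmd) throws InterruptedIOException {
        try {
            cmd.execute();
            return true;
        } catch (InterruptedIOException iioe) {
            // This clause must precede the IOException clause, because
            // InterruptedIOException is a subclass of IOException.
            Thread.currentThread().interrupt(); // restore the interrupt flag
            throw iioe;
        } catch (IOException ioe) {
            // A genuine failure to run the command.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // A command that fails for an ordinary reason is reported as unsupported.
        System.out.println(isSupported(() -> {
            throw new IOException("bash: not found");
        }));

        // An interrupted command propagates the interrupt to the caller.
        try {
            isSupported(() -> {
                throw new InterruptedIOException("interrupted");
            });
        } catch (InterruptedIOException e) {
            // Thread.interrupted() reads and clears the restored flag.
            System.out.println("interrupt propagated, flag set: " + Thread.interrupted());
        }
    }
}
```

Note that a method written this way declares a checked exception, which is why it could not be used directly to initialize a static member.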
> Unfortunately, the method is invoked from a static initializer (see
> Shell.<clinit> in the stack trace above), and Java does not allow a static
> initializer to throw a checked exception. We should remove the static member
> variable, so that the method can throw the interrupt exception. The node
> manager should call the static method instead of using the static member
> variable.
> This fix has an associated benefit: tests could run faster, because they
> would no longer spawn a bash process whenever they use a Shell static method
> or variable (which happens quite often, for example to check which operating
> system Hadoop is running on).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org