[ https://issues.apache.org/jira/browse/HADOOP-8717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15157288#comment-15157288 ]
Eric Badger commented on HADOOP-8717:
-------------------------------------

Starting a pseudo-distributed cluster on Mac OS X El Capitan using the sbin/start-yarn.sh script hits the JAVA_HOME issue on Hadoop 2.7.3. The culprit is the slaves.sh script, which yarn-daemons.sh calls in a for loop over all of the slave nodes. slaves.sh ssh's into each slave machine and runs the sbin/yarn-daemon.sh script there to start the NM; in pseudo-distributed mode the only slave is localhost. Running sbin/yarn-daemon.sh manually to start the NM on localhost works fine, but running it indirectly through start-yarn.sh -> yarn-daemons.sh -> slaves.sh seems to unset or clear JAVA_HOME. I have not dug far into the NM code, but an unset or cleared JAVA_HOME would explain the symptom: the NM fails when starting containers because it looks for "/bin/java" instead of "/somepath/bin/java".

Steps to reproduce the failure (the container fails to start, exiting with code 127):

{noformat}
$HADOOP_PREFIX/bin/hdfs namenode -format
$HADOOP_PREFIX/sbin/start-dfs.sh
$HADOOP_PREFIX/sbin/start-yarn.sh
{noformat}

Running the following sleep job then reports that the container failed to start:

{noformat}
$HADOOP_PREFIX/bin/hadoop jar $HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-${HADOOP_VERSION}-tests.jar sleep -Dmapreduce.job.queuename=default -m 1 -r 1 -mt 1 -rt 1
{noformat}

{noformat}
2016-02-22 10:55:58,142 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1449)) - Job job_1456160109510_0001 failed with state FAILED due to: Application application_1456160109510_0001 failed 3 times due to AM Container for appattempt_1456160109510_0001_000003 exited with exitCode: 127
For more detailed output, check application tracking page:http://localhost:8088/cluster/app/application_1456160109510_0001 Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_e53_1456160109510_0001_03_000001
Exit code: 127
Stack trace: ExitCodeException exitCode=127:
	at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
	at org.apache.hadoop.util.Shell.run(Shell.java:456)
	at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
	at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 127
Failing this attempt. Failing the application.
2016-02-22 10:55:58,158 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1454)) - Counters: 0
{noformat}

Steps that set up pseudo-distributed mode correctly (a sleep job run with this setup succeeds):
{noformat}
$HADOOP_PREFIX/bin/hdfs namenode -format
$HADOOP_PREFIX/sbin/start-dfs.sh
$HADOOP_PREFIX/sbin/yarn-daemon.sh start resourcemanager
$HADOOP_PREFIX/sbin/yarn-daemon.sh start nodemanager
{noformat}

> JAVA_HOME detected in hadoop-config.sh under OS X does not work
> ---------------------------------------------------------------
>
>                 Key: HADOOP-8717
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8717
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: conf, scripts
>         Environment: OS: Darwin 11.4.0 Darwin Kernel Version 11.4.0: Mon Apr 9 19:32:15 PDT 2012; root:xnu-1699.26.8~1/RELEASE_X86_64 x86_64
>                      java version "1.6.0_33"
>                      Java(TM) SE Runtime Environment (build 1.6.0_33-b03-424-11M3720)
>                      Java HotSpot(TM) 64-Bit Server VM (build 20.8-b03-424, mixed mode)
>            Reporter: Jianbin Wei
>            Assignee: Jianbin Wei
>            Priority: Minor
>              Labels: newbie, scripts
>         Attachments: HADOOP-8717.patch, HADOOP-8717.patch, HADOOP-8717.patch, HADOOP-8717.patch
>
> After setting up a single-node Hadoop on a Mac, copy some text file to it and run
> $ hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0-SNAPSHOT.jar wordcount /file.txt output
> It reports
> 12/08/21 15:32:18 INFO Job.java:mapreduce.Job:1265: Running job: job_1345588312126_0001
> 12/08/21 15:32:22 INFO Job.java:mapreduce.Job:1286: Job job_1345588312126_0001 running in uber mode : false
> 12/08/21 15:32:22 INFO Job.java:mapreduce.Job:1293: map 0% reduce 0%
> 12/08/21 15:32:22 INFO Job.java:mapreduce.Job:1306: Job job_1345588312126_0001 failed with state FAILED due to: Application application_1345588312126_0001 failed 1 times due to AM Container for appattempt_1345588312126_0001_000001 exited with exitCode: 127 due to: .Failing this attempt.. Failing the application.
> 12/08/21 15:32:22 INFO Job.java:mapreduce.Job:1311: Counters: 0
> $ cat /tmp/logs/application_1345588312126_0001/container_1345588312126_0001_01_000001/stderr
> /bin/bash: /bin/java: No such file or directory
> The detected JAVA_HOME is not used somehow.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
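The behavior described in the comment above (JAVA_HOME surviving a manual yarn-daemon.sh invocation but vanishing through start-yarn.sh -> yarn-daemons.sh -> slaves.sh) is consistent with how ssh launches remote commands: `ssh host cmd` runs a non-login, non-interactive shell, which never sources ~/.bash_profile, so a JAVA_HOME exported only there is invisible to daemons started via slaves.sh. A minimal sketch of that shell behavior, using hypothetical stand-ins (DEMO_JAVA_HOME for JAVA_HOME, /opt/jdk for a real JDK path, a temp file for ~/.bash_profile):

```shell
# Sketch only: DEMO_JAVA_HOME and /opt/jdk are hypothetical stand-ins.
unset DEMO_JAVA_HOME
profile=$(mktemp)                                  # stand-in for ~/.bash_profile
echo 'export DEMO_JAVA_HOME=/opt/jdk' > "$profile"

# A login-style shell that sources the profile sees the variable:
sourced=$(bash --noprofile --norc -c ". \"$profile\"; echo \${DEMO_JAVA_HOME:-unset}")
echo "$sourced"   # prints /opt/jdk

# The non-login shell that `ssh host cmd` (i.e. slaves.sh) provides does not
# source the profile, so the variable is simply absent:
plain=$(bash --noprofile --norc -c 'echo ${DEMO_JAVA_HOME:-unset}')
echo "$plain"     # prints unset

rm -f "$profile"
```

That is why the usual workaround is to export JAVA_HOME explicitly in etc/hadoop/hadoop-env.sh, which the Hadoop scripts source themselves regardless of how the shell was started; on OS X, `/usr/libexec/java_home` prints the active JDK path, so `export JAVA_HOME=$(/usr/libexec/java_home)` there is a common choice.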