As a follow-up: a place where you would typically set the JAVA_HOME environment variable is /etc/default/mesos-slave on Ubuntu.
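A minimal sketch of what that file might contain (the JDK and Hadoop paths here are illustrative assumptions — point them at your actual installs):

```shell
# /etc/default/mesos-slave -- environment for the mesos-slave service
# (illustrative paths; adjust JAVA_HOME and HADOOP_HOME to your installation)
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64/jre
export HADOOP_HOME=/opt/hadoop-2.6.0
# Make `hadoop` resolvable for the fetcher's `hadoop version` check
export PATH="$PATH:$HADOOP_HOME/bin"
```

Because this file is sourced by the slave's init script, the variables reach the slave process itself — unlike `~/.bashrc`, which only applies to interactive login shells.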
On Wed, Nov 4, 2015 at 11:38 AM, Elizabeth Lingg <elizab...@mesosphere.io> wrote:

> Ah, yes. I have seen this issue. Typically, it is because you have
> JAVA_HOME set on your host, but not on your Mesos agent. If you run a
> Marathon job and output "env" you will see the JAVA_HOME environment
> variable is missing. You would need to set it in your agent init
> configurations as export JAVA_HOME=<pathtojava>
>
> Thanks,
> Elizabeth
>
> On Wed, Nov 4, 2015 at 1:20 AM, haosdent <haosd...@gmail.com> wrote:
>
>> How about adding this flag when launching the slave:
>> --executor_environment_variables='{"HADOOP_HOME": "/opt/hadoop-2.6.0"}' ?
>>
>> On Wed, Nov 4, 2015 at 5:13 PM, Du, Fan <fan...@intel.com> wrote:
>>
>>> On 2015/11/4 17:09, haosdent wrote:
>>>
>>>> I notice
>>>> ```
>>>> "user":"root"
>>>> ```
>>>> Are you sure you can execute `hadoop version` as root?
>>>
>>> [root@tylersburg spark-1.5.1-bin-hadoop2.6]# whoami
>>> root
>>> [root@tylersburg spark-1.5.1-bin-hadoop2.6]# hadoop version
>>> Hadoop 2.6.0
>>> Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r e3496499ecb8d220fba99dc5ed4c99c8f9e33bb1
>>> Compiled by jenkins on 2014-11-13T21:10Z
>>> Compiled with protoc 2.5.0
>>> From source with checksum 18e43357c8f927c0695f1e9522859d6a
>>> This command was run using /opt/hadoop-2.6.0/share/hadoop/common/hadoop-common-2.6.0.jar
>>>
>>> [root@tylersburg spark-1.5.1-bin-hadoop2.6]# ls -hl /opt/hadoop-2.6.0/bin/hadoop
>>> -rwxr-xr-x. 1 root root 5.4K Nov 3 08:36 /opt/hadoop-2.6.0/bin/hadoop
>>>
>>>> On Wed, Nov 4, 2015 at 4:56 PM, Du, Fan <fan...@intel.com> wrote:
>>>>
>>>> On 2015/11/4 16:40, Tim Chen wrote:
>>>>
>>>> What OS are you running this with?
>>>>
>>>> And I assume if you run /bin/sh and try to run hadoop, it can be
>>>> found in your PATH as well?
>>>> I'm using CentOS-7.2
>>>>
>>>> # /bin/sh hadoop version
>>>> Hadoop 2.6.0
>>>> Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r e3496499ecb8d220fba99dc5ed4c99c8f9e33bb1
>>>> Compiled by jenkins on 2014-11-13T21:10Z
>>>> Compiled with protoc 2.5.0
>>>> From source with checksum 18e43357c8f927c0695f1e9522859d6a
>>>> This command was run using /opt/hadoop-2.6.0/share/hadoop/common/hadoop-common-2.6.0.jar
>>>>
>>>> Tim
>>>>
>>>> On Wed, Nov 4, 2015 at 12:34 AM, Du, Fan <fan...@intel.com> wrote:
>>>>
>>>> Hi Mesos experts,
>>>>
>>>> I set up a small Mesos cluster with 1 master and 6 slaves, and
>>>> deployed HDFS on the same cluster topology, both under the root user.
>>>>
>>>> # cat spark-1.5.1-bin-hadoop2.6/conf/spark-env.sh
>>>> export MESOS_NATIVE_JAVA_LIBRARY=/usr/local/lib/libmesos.so
>>>> export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.91-2.6.2.1.el7_1.x86_64/jre/
>>>> export SPARK_EXECUTOR_URI=hdfs://test/spark-1.5.1-bin-hadoop2.6.tgz
>>>>
>>>> When I run a simple SparkPi test:
>>>>
>>>> # export MASTER=mesos://Mesos_Master_IP:5050
>>>> # spark-1.5.1-bin-hadoop2.6/bin/run-example SparkPi 10000
>>>>
>>>> I get this on the slaves:
>>>>
>>>> I1104 22:24:02.238471 14518 fetcher.cpp:414] Fetcher Info: {"cache_directory":"\/tmp\/mesos\/fetch\/slaves\/556b49c1-7e6a-4f99-b320-c3f0c849e836-S6\/root","items":[{"action":"BYPASS_CACHE","uri":{"extract":true,"value":"hdfs:\/\/test\/spark-1.5.1-bin-hadoop2.6.tgz"}}],"sandbox_directory":"\/ws\/mesos\/slaves\/556b49c1-7e6a-4f99-b320-c3f0c849e836-S6\/frameworks\/556b49c1-7e6a-4f99-b320-c3f0c849e836-0003\/executors\/556b49c1-7e6a-4f99-b320-c3f0c849e836-S6\/runs\/9ec70f41-67d5-4a95-999f-933f3aa9e261","user":"root"}
>>>> I1104 22:24:02.240910 14518 fetcher.cpp:369] Fetching URI 'hdfs://test/spark-1.5.1-bin-hadoop2.6.tgz'
>>>> I1104 22:24:02.240931 14518 fetcher.cpp:243] Fetching directly into the sandbox directory
>>>> I1104 22:24:02.240952 14518 fetcher.cpp:180] Fetching URI 'hdfs://test/spark-1.5.1-bin-hadoop2.6.tgz'
>>>> E1104 22:24:02.245264 14518 shell.hpp:90] Command 'hadoop version 2>&1' failed; this is the output:
>>>> sh: hadoop: command not found
>>>> Failed to fetch 'hdfs://test/spark-1.5.1-bin-hadoop2.6.tgz': Skipping fetch with Hadoop client: Failed to execute 'hadoop version 2>&1'; the command was either not found or exited with a non-zero exit status: 127
>>>> Failed to synchronize with slave (it's probably exited)
>>>>
>>>> As for "sh: hadoop: command not found": it indicates that when Mesos
>>>> executes the "hadoop version" command, it cannot find a valid hadoop
>>>> binary. But when I actually log into the slave, "hadoop version" runs
>>>> fine, because I added the Hadoop directories to PATH:
>>>>
>>>> # cat ~/.bashrc
>>>> export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.91-2.6.2.1.el7_1.x86_64/jre/
>>>> export HADOOP_PREFIX=/opt/hadoop-2.6.0
>>>> export HADOOP_HOME=$HADOOP_PREFIX
>>>> export HADOOP_COMMON_HOME=$HADOOP_PREFIX
>>>> export HADOOP_CONF_DIR=$HADOOP_PREFIX/etc/hadoop
>>>> export HADOOP_HDFS_HOME=$HADOOP_PREFIX
>>>> export HADOOP_MAPRED_HOME=$HADOOP_PREFIX
>>>> export HADOOP_YARN_HOME=$HADOOP_PREFIX
>>>> export PATH=$PATH:$HADOOP_PREFIX/sbin:$HADOOP_PREFIX/bin
>>>>
>>>> I also tried to set HADOOP_HOME when launching mesos-slave; no luck,
>>>> the slave then complains it cannot find the JAVA_HOME env when
>>>> executing "hadoop version".
>>>>
>>>> Finally I checked the Mesos code where this error happens; it looks
>>>> quite straightforward.
>>>> ./src/hdfs/hdfs.hpp
>>>>
>>>> // HTTP GET on hostname:port and grab the information in the
>>>> // <title>...</title> (this is the best hack I can think of to get
>>>> // 'fs.default.name' given the tools available).
>>>> struct HDFS
>>>> {
>>>>   // Look for `hadoop' first where proposed, otherwise, look for
>>>>   // HADOOP_HOME, otherwise, assume it's on the PATH.
>>>>   explicit HDFS(const std::string& _hadoop)
>>>>     : hadoop(os::exists(_hadoop)
>>>>              ? _hadoop
>>>>              : (os::getenv("HADOOP_HOME").isSome()
>>>>                 ? path::join(os::getenv("HADOOP_HOME").get(), "bin/hadoop")
>>>>                 : "hadoop")) {}
>>>>
>>>>   // Look for `hadoop' in HADOOP_HOME or assume it's on the PATH.
>>>>   HDFS()
>>>>     : hadoop(os::getenv("HADOOP_HOME").isSome()
>>>>              ? path::join(os::getenv("HADOOP_HOME").get(), "bin/hadoop")
>>>>              : "hadoop") {}
>>>>
>>>>   // Check if hadoop client is available at the path that was set.
>>>>   // This can be done by executing `hadoop version` command and
>>>>   // checking for status code == 0.
>>>>   Try<bool> available()
>>>>   {
>>>>     Try<std::string> command = strings::format("%s version", hadoop);
>>>>
>>>>     CHECK_SOME(command);
>>>>
>>>>     // We are piping stderr to stdout so that we can see the error (if
>>>>     // any) in the logs emitted by `os::shell()` in case of failure.
>>>>     Try<std::string> out = os::shell(command.get() + " 2>&1");
>>>>
>>>>     if (out.isError()) {
>>>>       return Error(out.error());
>>>>     }
>>>>
>>>>     return true;
>>>>   }
>>>>
>>>> It puzzled me for a while; am I missing something obvious?
>>>> Thanks in advance.
>>
>> --
>> Best Regards,
>> Haosdent Huang
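The constructor logic quoted above can be paraphrased as a small shell sketch (an illustration of the lookup order, not Mesos code; `resolve_hadoop` is a hypothetical helper name). It prefers an explicitly proposed path if it exists on disk, then `$HADOOP_HOME/bin/hadoop`, and only falls back to a bare `hadoop` resolved through PATH — which is why a PATH set only in `~/.bashrc` is invisible to the slave process:

```shell
# Hypothetical sketch of the lookup order in src/hdfs/hdfs.hpp:
# 1. a proposed path, if it exists on disk;
# 2. $HADOOP_HOME/bin/hadoop, if HADOOP_HOME is set;
# 3. bare `hadoop`, resolved through whatever PATH the slave has.
resolve_hadoop() {
  proposed="$1"
  if [ -n "$proposed" ] && [ -e "$proposed" ]; then
    echo "$proposed"
  elif [ -n "${HADOOP_HOME:-}" ]; then
    echo "$HADOOP_HOME/bin/hadoop"
  else
    echo "hadoop"
  fi
}

# With HADOOP_HOME set, the PATH fallback is never consulted:
HADOOP_HOME=/opt/hadoop-2.6.0
resolve_hadoop ""   # -> /opt/hadoop-2.6.0/bin/hadoop
```

Note, per the thread, that setting HADOOP_HOME alone may not be enough: the `hadoop` wrapper script itself needs JAVA_HOME, so when going the `--executor_environment_variables` route, both variables would likely need to be passed.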