I have an update - and it's a weird one. In the course of trying to fix
my other problem
(http://mail-archives.apache.org/mod_mbox/hadoop-user/201806.mbox/ajax/%3C3276cbc6-6f04-9dcb-9be0-186334a0cf7a%40att.net%3E)
I hit upon some strange behavior germane to this problem.
I tried to cut my 3.1.0 cluster down to two nodes in hopes that
eliminating the dual-homed machine would restore proper operation. When
I ran start-dfs.sh, I got something to the effect of:
hostname msba02b,msba02c not found
In other words, it was taking the output of my sed command (as shown
below; a comma-separated list of the host names in the workers file, as
expected by the --hostnames option) and interpreting the whole thing as
a single hostname.
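To illustrate the distinction, here is a minimal standalone sketch; the split_hosts function is a made-up stand-in for however the launcher parses the --hostnames value, not actual Hadoop code, and the hostnames are just the ones from my error message:

```shell
#!/bin/sh
# Hypothetical stand-in for the launcher's hostname parsing: a
# comma-separated --hostnames value should be split into individual hosts.
split_hosts() {
    printf '%s\n' "$1" | tr ',' '\n'
}

# What should happen: the value yields two separate hostnames.
split_hosts 'msba02b,msba02c'
# -> msba02b
# -> msba02c

# The behavior I saw was as if no splitting happened at all, i.e. the
# whole comma-separated value was looked up as one hostname:
printf 'hostname %s not found\n' 'msba02b,msba02c'
```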
So I wondered: was the hdfs datanode command objecting to the absence of
periods, since my host tables and Hadoop configs now contain bare,
single-label hostnames? I made up a fake FQDN for everything; it made no
difference.
Finally, ***removing the --hostnames option entirely from the command***
- making it look like it did in the original distribution - did the trick.
To me, this makes no sense at all; how could the hdfs datanode command
work one way when the machines use dynamic DNS and another way when
using host tables for name resolution?
On 6/6/18 2:11 PM, Jeff Hubbs wrote:
Just as an FYI and hopefully leading to a bug-fix: the command to
start Datanode daemons in .../sbin/start-dfs.sh and
.../sbin/stop-dfs.sh from the prebuilt Hadoop distribution (both 3.0.1
and 3.1.0) won't run as written.
Here's the command that errors out:
hadoop_uservar_su hdfs datanode "${HADOOP_HDFS_HOME}/bin/hdfs" \
    --workers \
    --config "${HADOOP_CONF_DIR}" \
    --daemon start \
    datanode ${dataStartOpt}
What happens is that it treats the path to the workers file as the
network name of a datanode.
To get this to work properly, I use the --hostnames option, supplying as
its value a comma-delimited version of the one-name-per-line workers
file, like so:
hadoop_uservar_su hdfs datanode "${HADOOP_HDFS_HOME}/bin/hdfs" \
    --workers \
    --hostnames `sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/,/g' "${HADOOP_CONF_DIR}/workers"` \
    --config "${HADOOP_CONF_DIR}" \
    --daemon start \
    datanode ${dataStartOpt}
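For anyone unfamiliar with that sed idiom, here is a small standalone sketch of what it does, using a throwaway workers file with the two hostnames from my error message as sample data (paste -sd, is an equivalent, shorter alternative):

```shell
#!/bin/sh
# Build a sample workers file: one hostname per line, as Hadoop expects.
workers=$(mktemp)
printf 'msba02b\nmsba02c\n' > "$workers"

# The sed idiom from the command above: slurp every line into the pattern
# space (:a; N; $!ba loops until the last line), then replace each
# embedded newline with a comma.
sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/,/g' "$workers"
# -> msba02b,msba02c

# A shorter equivalent: paste -s joins all lines, -d, sets the delimiter.
paste -sd, "$workers"
# -> msba02b,msba02c

rm -f "$workers"
```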