I have an update - and it's a weird one. In the course of trying to fix my other problem (http://mail-archives.apache.org/mod_mbox/hadoop-user/201806.mbox/ajax/%3C3276cbc6-6f04-9dcb-9be0-186334a0cf7a%40att.net%3E) I hit upon some strange behavior germane to this problem.

I tried to cut my 3.1.0 cluster down to two nodes, dropping the dual-homed machine, to see whether that restored proper operation. When I tried to run start-dfs.sh I got something to the effect of:

   hostname msba02b,msba02c not found

In other words, it was taking the output of my sed command (as shown below; a comma-separated list of the host names in the workers file, as expected by the --hostnames option) and interpreting the whole thing as a single hostname.

So I wondered whether the hdfs datanode command disliked the absence of periods, since my host tables and Hadoop configs now contain bare, unqualified hostnames. I made up a fake FQDN for every host; it made no difference.

Finally, ***removing the --hostnames option entirely from the command*** - making it look like it did in the original distribution - did the trick.

To me, this makes no sense at all; how could the hdfs datanode command work one way when the machines use dynamic DNS and another way when using host tables for name resolution?






On 6/6/18 2:11 PM, Jeff Hubbs wrote:

Just as an FYI and hopefully leading to a bug-fix: the command to start Datanode daemons in .../sbin/start-dfs.sh and .../sbin/stop-dfs.sh from the prebuilt Hadoop distribution (both 3.0.1 and 3.1.0) won't run as written.

Here's the command that errors out:

    hadoop_uservar_su hdfs datanode "${HADOOP_HDFS_HOME}/bin/hdfs" \
        --workers \
        --config "${HADOOP_CONF_DIR}" \
        --daemon start \
        datanode ${dataStartOpt}

What happens is that it thinks the path to the workers file is the network name of a datanode.

To get this to work properly, I use the --hostnames option and supply as a value a comma-delimited version of the one-name-per-line workers file like so:

    hadoop_uservar_su hdfs datanode "${HADOOP_HDFS_HOME}/bin/hdfs" \
        --workers \
        --hostnames `sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/,/g' ${HADOOP_CONF_DIR}/workers` \
        --config "${HADOOP_CONF_DIR}" \
        --daemon start \
        datanode ${dataStartOpt}
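
For anyone who wants to check what the sed expression hands to --hostnames, here is a minimal sketch that runs it outside of Hadoop. It assumes GNU sed; the temporary file is a hypothetical stand-in for ${HADOOP_CONF_DIR}/workers, and the two host names are only sample values.

```shell
# Hypothetical stand-in for ${HADOOP_CONF_DIR}/workers: one host name per line.
workers_file="$(mktemp)"
printf 'msba02b\nmsba02c\n' > "${workers_file}"

# Same sed program as in the command above: keep appending the next line
# to the pattern space until the last line, then turn every embedded
# newline into a comma.
hostnames="$(sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/,/g' "${workers_file}")"

# Show the single string that would be passed as the --hostnames value.
echo "${hostnames}"

rm -f "${workers_file}"
```

The result is one comma-joined string with no spaces, which the shell passes to the command as a single argument.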


