After talking with the Vagrant community, I decided I was being too clever
by trying to run the datanodes on a separate subnet from the master node. I
changed my configuration to have all three hosts on the same subnet and
everything works just as expected.
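
For anyone who runs into the same thing, here is a minimal sketch of a sanity
check that all cluster hosts sit on one subnet. The addresses and the /24
prefix length below are hypothetical placeholders, not the values from my
Vagrantfile, and the helper name same_subnet is made up for illustration.

```python
import ipaddress


def same_subnet(hosts, prefix_len=24):
    """Return True if every address in `hosts` falls in one network.

    `hosts` maps hostname -> IPv4 address string. The prefix length is an
    assumption; adjust it to match the netmask of the private network.
    """
    networks = {
        ipaddress.ip_interface(f"{addr}/{prefix_len}").network
        for addr in hosts.values()
    }
    return len(networks) == 1


# Hypothetical layout with the datanodes on a different subnet from the master.
split_layout = {
    "hadoop-master": "10.0.51.4",
    "hadoop-data1": "10.0.52.4",
    "hadoop-data2": "10.0.52.6",
}

# Hypothetical layout with all three hosts on the same subnet.
flat_layout = {
    "hadoop-master": "10.0.51.4",
    "hadoop-data1": "10.0.51.5",
    "hadoop-data2": "10.0.51.6",
}

print(same_subnet(split_layout))
print(same_subnet(flat_layout))
```

The first layout mirrors the kind of split setup that gave me trouble and the
second mirrors the kind of flat setup that works; only the flat one passes
the check.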

Thanks for all your help and input.

> Vinay,
> There is no gateway to 51.*. These are IP addresses that I set in my
> Vagrantfile for virtualbox as part of a private network:
> "private_network", ip:
> This allows me to spin up all the hosts for my cluster automatically and
> know that they always have the same IP addresses.
> From hadoop-data1 ( I have unrestricted access to
> hadoop-master (
> hadoop@hadoop-data1:~$ ifconfig
> eth1      Link encap:Ethernet  HWaddr 08:00:27:b9:55:25
>           inet addr:  Bcast:  Mask:
> hadoop@hadoop-data1:~$ ping hadoop-master
> PING hadoop-master ( 56(84) bytes of data.
> 64 bytes from hadoop-master ( icmp_seq=1 ttl=63 time=3.13 ms
> 64 bytes from hadoop-master ( icmp_seq=2 ttl=63 time=2.72 ms
> I'm not sure I understand exactly what you're asking for, but from the
> master I can run this
> vagrant@hadoop-master:~$ sudo netstat -tnulp | grep 54310
> tcp        0      0 *
> LISTEN      22944/java
> I understand what you're saying about a gateway often existing at that
> address for a subnet. I'm not familiar enough with Vagrant to answer this
> right now, but I will put in a question there.
> I can also change the other two IP addresses to be on the same 51. subnet.
> I may try that next.
>> might be gateway to 51.* subnet right?
>> Can you verify whether connections from outside the 51 subnet to the 51.4
>> machine show the other subnet's IP as the remote IP?
>> You can create any connection; it need not be namenode-datanode.
>> For example, a connection from a DN to the namenode should
>> result in the following when checked with the netstat command on the
>> namenode machine: "netstat -tnulp | grep <NN_RPC_PORT>"
>> The output should be something like this:
>> tcp        0      0
>>     LISTEN      -
>> If the Foreign IP is listed as instead of,
>> then the gateway is not passing the original client IP forward; it is
>> re-creating connections with its own IP. In that case the problem is with
>> the gateway.
>> It's just a guess; reality could be different.
>> Please check and let me know.
>> -Vinay
>>> Thanks to Namikaze for pointing out that I should have sent the namenode log
>>> as a pastebin
>>>> I have posted the namenode logs here:
>>>> Thanks for all the help.
>>>>> Thanks for sharing the logs.
>>>>> The problem is interesting. Can you please post the namenode logs and
>>>>> the dual-IP configurations? (I suspect a problem with the gateway when
>>>>> sending requests from the 52.1 segment to the 51.1 segment.)
>>>>> hadoop-master
>>>>> hadoop-data1
>>>>> hadoop-data2
>>>>> Sorry, I am not able to access the logs. Could you please post them in
>>>>> a pastebin, or attach the DN logs (as your query is about why a
>>>>> different IP is shown) and the namenode logs here?
>>>>> Brahma,
>>>>> Thanks for the reply. I'll keep this conversation here in the user
>>>>> list. The /etc/hosts file is identical on all three nodes
>>>>> hadoop@hadoop-data1:~$ cat /etc/hosts
>>>>> localhost
>>>>> hadoop-master
>>>>> hadoop-data1
>>>>> hadoop-data2
>>>>> hadoop@hadoop-data2:~$ cat /etc/hosts
>>>>> localhost
>>>>> hadoop-master
>>>>> hadoop-data1
>>>>> hadoop-data2
>>>>> hadoop@hadoop-master:~$ cat /etc/hosts
>>>>> localhost
>>>>> hadoop-master
>>>>> hadoop-data1
>>>>> hadoop-data2
>>>>> Here are the startup logs for all three nodes:
>>>>> Thanks for your help.
>>>>> It seems the DN started on three machines and failed on
>>>>> hadoop-data1(
>>>>> : giving IP as <>. Can
>>>>> you please check the /etc/hosts file of (<> might be
>>>>> configured in /etc/hosts)?
>>>>> : datanode startup might have failed (you can check this
>>>>> node's logs)
>>>>> :  <> Datanode startup
>>>>> succeeded; this is on the master node.
>>>>> I'm still stuck on this and posted it to stackoverflow:
>>>>> Thanks,
>>>>> Daniel
>>>>> I could really use some help here. As you can see from the output
>>>>> below, the two attached datanodes are identified with a non-existent IP
>>>>> address. Can someone tell me how that gets selected or how to explicitly
>>>>> set it? Also, why are both datanodes shown under the same name/IP?
>>>>> hadoop@hadoop-master:~$ hdfs dfsadmin -report
>>>>> Configured Capacity: 84482326528 (78.68 GB)
>>>>> Present Capacity: 75745546240 (70.54 GB)
>>>>> DFS Remaining: 75744862208 (70.54 GB)
>>>>> DFS Used: 684032 (668 KB)
>>>>> DFS Used%: 0.00%
>>>>> Under replicated blocks: 0
>>>>> Blocks with corrupt replicas: 0
>>>>> Missing blocks: 0
>>>>> Missing blocks (with replication factor 1): 0
>>>>> -------------------------------------------------
>>>>> Live datanodes (2):
>>>>> Name: (
>>>>> Hostname: hadoop-data1
>>>>> Decommission Status : Normal
>>>>> Configured Capacity: 42241163264 (39.34 GB)
>>>>> DFS Used: 303104 (296 KB)
>>>>> Non DFS Used: 4302479360 (4.01 GB)
>>>>> DFS Remaining: 37938380800 (35.33 GB)
>>>>> DFS Used%: 0.00%
>>>>> DFS Remaining%: 89.81%
>>>>> Configured Cache Capacity: 0 (0 B)
>>>>> Cache Used: 0 (0 B)
>>>>> Cache Remaining: 0 (0 B)
>>>>> Cache Used%: 100.00%
>>>>> Cache Remaining%: 0.00%
>>>>> Xceivers: 1
>>>>> Last contact: Fri Sep 25 13:25:37 UTC 2015
>>>>> Name: (hadoop-master)
>>>>> Hostname: hadoop-master
>>>>> Decommission Status : Normal
>>>>> Configured Capacity: 42241163264 (39.34 GB)
>>>>> DFS Used: 380928 (372 KB)
>>>>> Non DFS Used: 4434300928 (4.13 GB)
>>>>> DFS Remaining: 37806481408 (35.21 GB)
>>>>> DFS Used%: 0.00%
>>>>> DFS Remaining%: 89.50%
>>>>> Configured Cache Capacity: 0 (0 B)
>>>>> Cache Used: 0 (0 B)
>>>>> Cache Remaining: 0 (0 B)
>>>>> Cache Used%: 100.00%
>>>>> Cache Remaining%: 0.00%
>>>>> Xceivers: 1
>>>>> Last contact: Fri Sep 25 13:25:38 UTC 2015
>>>>> The IP address is clearly wrong, but I'm not sure how it gets set. Can
>>>>> someone tell me how to configure it to choose a valid IP address?
>>>>> I just noticed that both datanodes appear to have chosen that IP
>>>>> address and bound that port for HDFS communication.
>>>>> Any idea why this would be? Is there some way to specify which
>>>>> IP/hostname should be used for that?
>>>>> When I try to run a map reduce example, I get the following error:
>>>>> hadoop@hadoop-master:~$ hadoop jar
>>>>> /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar
>>>>> pi 10 30
>>>>> Number of Maps  = 10
>>>>> Samples per Map = 30
>>>>> 15/09/24 20:04:28 INFO hdfs.DFSClient: Exception in
>>>>> createBlockOutputStream
>>>>> Got error, status message , ack with firstBadLink
>>>>> as
>>>>>         at
>>>>> org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(
>>>>>         at
>>>>> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(
>>>>>         at
>>>>> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(
>>>>>         at
>>>>> org.apache.hadoop.hdfs.DFSOutputStream$
>>>>> 15/09/24 20:04:28 INFO hdfs.DFSClient: Abandoning
>>>>> BP-852923283-
>>>>> 15/09/24 20:04:28 INFO hdfs.DFSClient: Excluding datanode
>>>>> DatanodeInfoWithStorage[
>>>>> ,DS-45f6e06d-752e-41e8-ac25-ca88bce80d00,DISK]
>>>>> 15/09/24 20:04:28 WARN hdfs.DFSClient: Slow waitForAckedSeqno took
>>>>> 65357ms (threshold=30000ms)
>>>>> Wrote input for Map #0
>>>>> I'm not sure why it's trying to access, which
>>>>> isn't even a valid IP address in my setup.
>>>>> Daniel

