[ https://issues.apache.org/jira/browse/HDFS-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12906841#action_12906841 ]

Allen Wittenauer commented on HDFS-1379:
----------------------------------------

Some of the issues here are also covered in HADOOP-6364.

But yes, multi-homing is a known brokenness.

It is probably worth pointing out that:

a) the bang-for-buck of having a separate network for IPC/RPC communications
isn't very good, so pretty much no one does it

b) monitoring a private interface instead of the public one leaves you exposed
to failures on the network side

> Multihoming brokenness in HDFS
> ------------------------------
>
>                 Key: HDFS-1379
>                 URL: https://issues.apache.org/jira/browse/HDFS-1379
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs client, name-node
>    Affects Versions: 0.20.1
>         Environment: Multi-homed namenode and datanodes. hadoop-0.20.1 
> (cloudera distribution on linux)
>            Reporter: Matthew Byng-Maddick
>
> We have a setup where, because we only have very few machines (4 x 16-core),
> we're looking at co-locating namenodes and datanodes. We also have
> front-end and back-end networks. The set-up is something like:
> * machine1
> ** front-end 10.18.80.80
> ** back-end 192.168.24.40
> * machine2
> ** front-end 10.18.80.82
> ** back-end 192.168.24.41
> * machine3
> ** front-end 10.18.80.84
> ** back-end 192.168.24.42
> * machine4
> ** front-end 10.18.80.86
> ** back-end 192.168.24.43
> On each, the property *slave.host.name* is configured with the 192.168.24.x
> address (the *.dns.interface settings don't actually seem to help, but that's
> a separate problem), *dfs.datanode.address* is bound to the 192.168.24.x
> address on :50010, and similarly *dfs.datanode.ipc.address* is bound there.
> In order to get efficient use of our machines, we bring up a namenode on one
> of them (this then rsyncs the latest namenode fsimage etc.) by bringing up a
> VIP on each side (we use the 10.18.80.x side for monitoring rather than
> actual Hadoop comms) and binding the namenode to that VIP; on the inside this
> is 192.168.24.19.
> The namenode now knows about 4 datanodes: 192.168.24.40/1/2/3. These
> datanodes know how they're bound. However, when the namenode tells an
> external HDFS client where to store the blocks, it gives out
> 192.168.24.19:50010 as one of the addresses (despite the datanode not being
> bound there), because that's where the datanode->namenode RPC comes from.
> This is wrong: if you've bound the datanode explicitly (using
> *dfs.datanode.address*) then that should be the only address the namenode
> can give out (it's reasonable, given your comms model, not to support NAT
> between clients and data slaves). If you bind it to * then the normal rules
> for slave.host.name, dfs.datanode.dns.interface etc. should take precedence.
> This may already be fixed in releases later than 0.20.1, but if it isn't it
> should probably be: you explicitly allow binding of the datanode addresses,
> and it's unreasonable to expect that comms to the datanode will always come
> from those addresses, especially in multi-homed environments. Separating
> traffic out by network, especially when dealing with large volumes of data,
> is useful.
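
For concreteness, the set-up described above maps onto roughly the following
0.20-style configuration on machine1. This is a sketch rather than the
reporter's actual files: the namenode RPC port (8020) and the datanode IPC
port (50020) are assumed defaults, not values given in the report.

    <!-- core-site.xml: clients and datanodes reach the namenode via the
         back-end VIP (port 8020 is an assumed default) -->
    <property>
      <name>fs.default.name</name>
      <value>hdfs://192.168.24.19:8020</value>
    </property>

    <!-- hdfs-site.xml on machine1: bind and advertise the back-end address -->
    <property>
      <name>slave.host.name</name>
      <value>192.168.24.40</value>
    </property>
    <property>
      <name>dfs.datanode.address</name>
      <value>192.168.24.40:50010</value>
    </property>
    <property>
      <name>dfs.datanode.ipc.address</name>
      <value>192.168.24.40:50020</value>  <!-- port assumed default -->
    </property>

One way to check which addresses the namenode has actually recorded and is
handing out is hadoop dfsadmin -report (the registered datanode addresses)
and hadoop fsck / -files -blocks -locations (the addresses returned as block
locations); in the scenario above, the co-located datanode would be expected
to show up as 192.168.24.19 rather than 192.168.24.40.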
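
The second case the reporter distinguishes, binding the datanode to the
wildcard address and letting slave.host.name / dfs.datanode.dns.interface
determine what gets advertised, would look something like the sketch below.
Again this is illustrative; the interface name eth1 is an assumption, not
something stated in the report.

    <!-- hdfs-site.xml: wildcard bind; the advertised address should then be
         governed by slave.host.name / dfs.datanode.dns.interface -->
    <property>
      <name>dfs.datanode.address</name>
      <value>0.0.0.0:50010</value>
    </property>
    <property>
      <name>slave.host.name</name>
      <value>192.168.24.40</value>
    </property>
    <property>
      <name>dfs.datanode.dns.interface</name>
      <value>eth1</value>  <!-- assumed name of the back-end NIC -->
    </property>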

