[ https://issues.apache.org/jira/browse/HDFS-7810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Biju Nair updated HDFS-7810:
----------------------------
    Description: 
When a new DN is added to the cluster, its registration with the NN fails. The 
steps followed were:

- Install and start a new DN
- Add entry for the DN in the NN {{/etc/hosts}} file

DN log shows that the registration process failed

- Tried to restart DN with the same result
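
One quick sanity check for the steps above is to confirm that the hosts file on the NN really maps the DN's address to a name. The following is a minimal Python sketch, assuming the usual hosts-file layout (an address, then one or more names, with {{#}} starting a comment). The function name {{names_for_ip}} and the host name {{new-dn}} are hypothetical; the address is the one that appears in the log entries of this report.

```python
def names_for_ip(hosts_text, ip):
    """Return the host names that a hosts-format text maps to ip."""
    names = []
    for line in hosts_text.splitlines():
        # Strip any trailing comment, then split into address and names.
        entry = line.split("#", 1)[0].split()
        if len(entry) >= 2 and entry[0] == ip:
            names.extend(entry[1:])
    return names


# Hypothetical hosts content; on the NN this would be read from /etc/hosts.
sample_hosts = """
127.0.0.1       localhost
192.168.100.13  new-dn   # hypothetical name for the new DN
"""
print(names_for_ip(sample_hosts, "192.168.100.13"))
```

An empty list for the DN's address would suggest the hosts entry is missing or mistyped.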

Since all the DNs have multiple network interfaces, we use the following 
{{hdfs-site.xml}} property instead of listing all the 
{{dfs.datanode.xx.address}} properties.

{code:xml}
  <property>
    <name>dfs.datanode.dns.interface</name>
    <value>eth2</value>
  </property>
{code}

- Restarting the NN resolves the registration issue, but an NN restart is not 
desirable. 
- Adding the following {{dfs.datanode.xx.address}} properties seems to resolve 
the DN registration issue without an NN restart. But this behavior differs 
from *hadoop 2.2*. Is there a reason for the change?

{code:xml}
  <property>
    <name>dfs.datanode.address</name>
    <value>192.168.0.12:50010</value>
  </property>

  <property>
    <name>dfs.datanode.ipc.address</name>
    <value>192.168.0.12:50020</value>
  </property>

  <property>
    <name>dfs.datanode.http.address</name>
    <value>192.168.0.12:50075</value>
  </property>
{code}
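
Because these properties carry a host-specific address, each DN needs its own values. As a rough illustration only, the Python sketch below generates the three properties for a given address. The helper name {{address_properties}} is hypothetical, and the ports are simply the ones shown in the properties above.

```python
import xml.etree.ElementTree as ET

# Property names and ports as they appear in the workaround above.
PORTS = {
    "dfs.datanode.address": 50010,
    "dfs.datanode.ipc.address": 50020,
    "dfs.datanode.http.address": 50075,
}


def address_properties(ip):
    """Build one <property> element per DN endpoint, bound to ip."""
    props = []
    for name, port in PORTS.items():
        prop = ET.Element("property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = f"{ip}:{port}"
        props.append(prop)
    return props


for prop in address_properties("192.168.0.12"):
    print(ET.tostring(prop, encoding="unicode"))
```

The printed elements would still have to be merged into each DN's {{hdfs-site.xml}} by hand or by whatever configuration tooling is in use.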

*NN Log Error Entry*
{quote}
2015-02-17 12:21:53,583 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 8020, call org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol.registerDatanode from 192.168.100.13:37516 Call#1027 Retry#0 
org.apache.hadoop.hdfs.server.protocol.DisallowedDatanodeException: Datanode denied communication with namenode because hostname cannot be resolved (ip=192.168.100.13, hostname=192.168.100.13): DatanodeRegistration(0.0.0.0, datanodeUuid=bd23eb3c-a5b9-43e4-ad23-1683346564ac, infoPort=50075, ipcPort=50020, storageInfo=lv=-56;cid=CID-02099252-fbca-4bf2-b466-9a0ed67e53a3;nsid=2048643132;c=0) 
    at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.registerDatanode(DatanodeManager.java:887) 
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.registerDatanode(FSNamesystem.java:5002) 
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.registerDatanode(NameNodeRpcServer.java:1065) 
    at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.registerDatanode(DatanodeProtocolServerSideTranslatorPB.java:92) 
    at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:26378) 
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619) 
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962) 
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039) 
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:415) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) 
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033) 
2015-02-17 12:21:58,607 WARN org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Unresolved datanode registration: hostname cannot be resolved (ip=192.168.100.13, hostname=192.168.100.13) 
{quote}
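
The {{hostname=192.168.100.13}} in the message above is the telling detail: the name reported for the DN is just its IP address, which is what a failed reverse lookup typically looks like. The Python sketch below mimics that kind of check so it can be tried from the NN host. It is an illustration under assumptions, not the NameNode's actual code; the function names and the injectable {{reverse_lookup}} parameter are hypothetical.

```python
import socket


def reverse_name(ip):
    """Reverse-resolve ip; fall back to the ip text itself on failure."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        # socket.herror and socket.gaierror are both OSError subclasses.
        return ip


def is_name_resolved(ip, reverse_lookup=reverse_name):
    """True when the reverse lookup yields something other than the ip."""
    return reverse_lookup(ip) != ip
```

Calling {{is_name_resolved("192.168.100.13")}} on the NN host shows whether the operating system resolver can currently map the DN's address to a name. The check runs in a fresh process, so it says nothing about what an already running NN process sees.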

*DN Log Error Entry*
{quote}
2015-02-17 12:21:02,994 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool BP-1782713777-10.0.100.11-1424188575377 (Datanode Uuid null) service to f-bcpc-vm1/192.168.100.11:8020 beginning handshake with NN 
2015-02-17 12:21:03,006 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool BP-1782713777-10.0.100.11-1424188575377 (Datanode Uuid null) service to f-bcpc-vm1/192.168.100.11:8020 Datanode denied communication with namenode because hostname cannot be resolved (ip=192.168.100.13, hostname=192.168.100.13): DatanodeRegistration(0.0.0.0, datanodeUuid=bd23eb3c-a5b9-43e4-ad23-1683346564ac, infoPort=50075, ipcPort=50020, storageInfo=lv=-56;cid=CID-02099252-fbca-4bf2-b466-9a0ed67e53a3;nsid=2048643132;c=0) 
    at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.registerDatanode(DatanodeManager.java:887) 
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.registerDatanode(FSNamesystem.java:5002) 
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.registerDatanode(NameNodeRpcServer.java:1065) 
    at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.registerDatanode(DatanodeProtocolServerSideTranslatorPB.java:92) 
    at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:26378) 
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619) 
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962) 
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039) 
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:415) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) 
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)
{quote}
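
When several DNs are affected, it can help to pull the rejected addresses out of the NN or DN log rather than reading the traces by eye. The Python sketch below does that for the message text shown in the entries above. The function name {{unresolved_datanodes}} is hypothetical, and the pattern tolerates line wrapping inside the message.

```python
import re

# Matches "hostname cannot be resolved (ip=..., hostname=...)" even when
# the message has been wrapped across lines.
PATTERN = re.compile(
    r"hostname\s+cannot\s+be\s+resolved\s+"
    r"\(ip=([^,\s]+),\s+hostname=([^)\s]+)\)"
)


def unresolved_datanodes(log_text):
    """Return the distinct (ip, hostname) pairs reported as unresolved."""
    pairs = []
    for pair in PATTERN.findall(log_text):
        if pair not in pairs:
            pairs.append(pair)
    return pairs
```

A pair whose two fields are identical, as in the entries above, indicates that no name was found for that address.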


> Datanode registration process fails in hadoop 2.6 
> --------------------------------------------------
>
>                 Key: HDFS-7810
>                 URL: https://issues.apache.org/jira/browse/HDFS-7810
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode, namenode
>    Affects Versions: 2.6.0
>         Environment: ubuntu 12
>            Reporter: Biju Nair
>              Labels: hadoop



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
