[ 
https://issues.apache.org/jira/browse/HDFS-15919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoqiao He updated HDFS-15919:
-------------------------------
    Fix Version/s: 3.2.3
                   3.4.0
                   3.3.1
     Hadoop Flags: Reviewed
       Resolution: Fixed
           Status: Resolved  (was: Patch Available)

LGTM, +1. Committed to trunk and cherry-picked cleanly to branch-3.3 and branch-3.2.
Thanks [~sodonnell] for your contributions! Thanks [~vjasani] and [~ayushtkn] 
for your reviews!

> BlockPoolManager should log stack trace if unable to get Namenode addresses
> ---------------------------------------------------------------------------
>
>                 Key: HDFS-15919
>                 URL: https://issues.apache.org/jira/browse/HDFS-15919
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>    Affects Versions: 3.4.0
>            Reporter: Stephen O'Donnell
>            Assignee: Stephen O'Donnell
>            Priority: Major
>             Fix For: 3.3.1, 3.4.0, 3.2.3
>
>         Attachments: HDFS-15919.001.patch
>
>
> If the HDFS configuration is invalid, the datanode can fail to start with 
> the following stack trace:
> {code}
> 2021-03-24 05:58:27,026 INFO  datanode.DataNode 
> (BlockPoolManager.java:refreshNamenodes(149)) - Refresh request received for 
> nameservices: null
> 2021-03-24 05:58:27,033 WARN  datanode.DataNode 
> (BlockPoolManager.java:refreshNamenodes(161)) - Unable to get NameNode 
> addresses.
> ...
> 2021-03-24 05:58:27,077 ERROR datanode.DataNode 
> (DataNode.java:secureMain(2883)) - Exception in secureMain
> java.io.IOException: No services to connect, missing NameNode address.
>       at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolManager.refreshNamenodes(BlockPoolManager.java:165)
>       at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:1440)
>       at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:500)
>       at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2782)
>       at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2690)
>       at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2732)
>       at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2876)
>       at 
> org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter.start(SecureDataNodeStarter.java:100)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at 
> org.apache.commons.daemon.support.DaemonLoader.start(DaemonLoader.java:243)
> {code}
> In this case, the issue was an exception thrown in 
> DFSUtil.getNNServiceRpcAddressesForCluster(...), but there are a couple of 
> scenarios within that method which can throw an exception, so it is 
> difficult to figure out what is wrong with the config.
> We should simply add the exception to the existing log message when an 
> error occurs, so it is clear what caused it.
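>
> A minimal sketch of the idea (illustrative only, not the committed patch; 
> the class name and exception message below are made up, but the SLF4J call 
> is real): passing the caught exception as the last argument to {{LOG.warn}} 
> makes the stack trace appear alongside the existing WARN message.
> {code}
> import java.io.IOException;
>
> import org.slf4j.Logger;
> import org.slf4j.LoggerFactory;
>
> public class LogCauseSketch {
>   private static final Logger LOG =
>       LoggerFactory.getLogger(LogCauseSketch.class);
>
>   public static void main(String[] args) {
>     try {
>       // Stand-in for DFSUtil.getNNServiceRpcAddressesForCluster(conf),
>       // which can fail for more than one kind of misconfiguration.
>       throw new IOException("Incorrect configuration: namenode address "
>           + "dfs.namenode.rpc-address is not configured");
>     } catch (IOException ioe) {
>       // Before: LOG.warn("Unable to get NameNode addresses.");
>       // After:  include the exception so its stack trace is logged too.
>       LOG.warn("Unable to get NameNode addresses.", ioe);
>     }
>   }
> }
> {code}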



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
