Stephen O'Donnell created HDFS-15919:
----------------------------------------

             Summary: BlockPoolManager should log stack trace if unable to get 
Namenode addresses
                 Key: HDFS-15919
                 URL: https://issues.apache.org/jira/browse/HDFS-15919
             Project: Hadoop HDFS
          Issue Type: Improvement
          Components: datanode
    Affects Versions: 3.4.0
            Reporter: Stephen O'Donnell
            Assignee: Stephen O'Donnell


If the HDFS configuration is invalid, the DataNode can fail to start with 
this stack trace:

{code}
2021-03-24 05:58:27,026 INFO  datanode.DataNode (BlockPoolManager.java:refreshNamenodes(149)) - Refresh request received for nameservices: null
2021-03-24 05:58:27,033 WARN  datanode.DataNode (BlockPoolManager.java:refreshNamenodes(161)) - Unable to get NameNode addresses.
...
2021-03-24 05:58:27,077 ERROR datanode.DataNode (DataNode.java:secureMain(2883)) - Exception in secureMain
java.io.IOException: No services to connect, missing NameNode address.
        at org.apache.hadoop.hdfs.server.datanode.BlockPoolManager.refreshNamenodes(BlockPoolManager.java:165)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:1440)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:500)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2782)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2690)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2732)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2876)
        at org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter.start(SecureDataNodeStarter.java:100)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.commons.daemon.support.DaemonLoader.start(DaemonLoader.java:243)
{code}

In this case, the underlying problem was an exception thrown in 
DFSUtil.getNNServiceRpcAddressesForCluster(...), but a couple of scenarios 
within that method can cause an exception and the log does not say which one 
occurred, so it is difficult to figure out what is wrong with the config.

We should simply add the exception to the existing log message when an error 
occurs so it is clear what caused it.
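
As a rough illustration of the proposal (not the actual patch), the sketch 
below passes the caught exception to the logger along with the existing 
warning text. It uses java.util.logging so that it runs without Hadoop on the 
classpath. The class RefreshLoggingSketch, the methods lookupAddresses and 
refresh, and the failure message are all hypothetical stand-ins; the real 
failure scenarios inside DFSUtil are not listed in this issue.

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

public class RefreshLoggingSketch {
  // Logger name borrowed from the log output above; any name would do here.
  static final Logger LOG = Logger.getLogger("datanode.DataNode");

  /**
   * Hypothetical stand-in for the NameNode address lookup. The condition and
   * message are made up for illustration only.
   */
  static void lookupAddresses(String nameservices) throws IOException {
    if (nameservices == null) {
      throw new IOException("illustrative cause: no nameservices configured");
    }
  }

  /** Returns true if the lookup succeeded; on failure logs the cause too. */
  static boolean refresh(String nameservices) {
    try {
      lookupAddresses(nameservices);
      return true;
    } catch (IOException ioe) {
      // Passing the exception as the last argument attaches its message and
      // stack trace to the existing warning instead of discarding them.
      LOG.log(Level.WARNING, "Unable to get NameNode addresses.", ioe);
      return false;
    }
  }

  public static void main(String[] args) {
    refresh(null);
  }
}
```

With the exception attached, the warning is followed by the message and stack 
trace of whichever scenario actually failed, which is the information missing 
from the log shown above.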



