Stephen O'Donnell created HDFS-15919:
----------------------------------------

             Summary: BlockPoolManager should log stack trace if unable to get Namenode addresses
                 Key: HDFS-15919
                 URL: https://issues.apache.org/jira/browse/HDFS-15919
             Project: Hadoop HDFS
          Issue Type: Improvement
          Components: datanode
    Affects Versions: 3.4.0
            Reporter: Stephen O'Donnell
            Assignee: Stephen O'Donnell


If the HDFS config is misconfigured, the datanode can fail to start with this stack trace:

{code}
2021-03-24 05:58:27,026 INFO datanode.DataNode (BlockPoolManager.java:refreshNamenodes(149)) - Refresh request received for nameservices: null
2021-03-24 05:58:27,033 WARN datanode.DataNode (BlockPoolManager.java:refreshNamenodes(161)) - Unable to get NameNode addresses.
...
2021-03-24 05:58:27,077 ERROR datanode.DataNode (DataNode.java:secureMain(2883)) - Exception in secureMain
java.io.IOException: No services to connect, missing NameNode address.
	at org.apache.hadoop.hdfs.server.datanode.BlockPoolManager.refreshNamenodes(BlockPoolManager.java:165)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:1440)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:500)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2782)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2690)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2732)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2876)
	at org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter.start(SecureDataNodeStarter.java:100)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.commons.daemon.support.DaemonLoader.start(DaemonLoader.java:243)
{code}

In this case, the issue was an exception thrown in
DFSUtil.getNNServiceRpcAddressesForCluster(...), but several scenarios within it can throw an exception, so it's difficult to figure out what is wrong with the config.

We should simply add the exception to the existing log message when an error occurs so it is clear what caused it.
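
To illustrate the proposal, here is a minimal, self-contained sketch of logging a warning together with its cause. It is not the actual patch: it uses java.util.logging as a stand-in for the logger Hadoop uses, and the class name, resolveAddresses(), the exception text, and CapturingHandler are all hypothetical names invented for the example.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

public class NamenodeAddressLoggingSketch {

    /** Hypothetical stand-in for the address lookup; it always fails here. */
    static void resolveAddresses() throws IOException {
        throw new IOException("hypothetical configuration error");
    }

    /** Logs the warning together with its cause instead of the message alone. */
    static void refresh(Logger log) {
        try {
            resolveAddresses();
        } catch (IOException e) {
            // Passing the exception as the last argument attaches it to the
            // log record, so formatters can print its message and stack trace.
            log.log(Level.WARNING, "Unable to get NameNode addresses.", e);
        }
    }

    /** Collects log records in memory so the attached cause can be inspected. */
    static final class CapturingHandler extends Handler {
        final List<LogRecord> records = new ArrayList<>();

        @Override
        public void publish(LogRecord record) {
            records.add(record);
        }

        @Override
        public void flush() {
        }

        @Override
        public void close() {
        }
    }

    public static void main(String[] args) {
        Logger log = Logger.getLogger("sketch");
        CapturingHandler handler = new CapturingHandler();
        log.addHandler(handler);

        refresh(log);

        // The captured record carries the original exception as its cause.
        Throwable cause = handler.records.get(0).getThrown();
        System.out.println("cause attached: " + cause);
    }
}
```

With the cause attached to the WARN record, the reason the lookup failed appears next to the warning rather than having to be inferred from the later "No services to connect" error.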