wangzhihui created HDFS-17356:
-
Summary: RBF Add Configuration dfs.federation.router.ns.name Optimization
Key: HDFS-17356
URL: https://issues.apache.org/jira/browse/HDFS-17356
Project: Hadoop HDFS
Issue Type: Improvement
Components: dfs, rbf
Reporter: wangzhihui
Fix For: 3.3.0
When RBF (Router-based Federation) is enabled in HDFS, and the HDFS server
processes (NameNode, ZKFC) and the RBFClient run on the same node with the same
configuration, the NameNode fails to start with the exception below. The cause
is that the nameservice (NS) of the Router service has been added to the
dfs.nameservices list. At startup, the NameNode looks up the NS that the
current node belongs to, finds more than one NS matching the local address, and
fails the existing validation, so startup is aborted. Currently, the only
workaround is to keep separate hdfs-site.xml files for the RouterClient and the
NameNode. However, splitting the configuration this way makes unified
management of the cluster configuration harder. We therefore propose a better
solution to this problem.
{code:java}
2023-10-30 15:53:24,613 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
2023-10-30 15:53:24,672 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: createNameNode []
2023-10-30 15:53:24,760 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: Loaded properties from hadoop-metrics2.properties
2023-10-30 15:53:24,842 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled Metric snapshot period at 10 second(s).
2023-10-30 15:53:24,842 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system started
2023-10-30 15:53:24,868 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode.
org.apache.hadoop.HadoopIllegalArgumentException: Configuration has multiple addresses that match local node's address. Please configure the system with dfs.nameservice.id and dfs.ha.namenode.id
        at org.apache.hadoop.hdfs.DFSUtil.getSuffixIDs(DFSUtil.java:1257)
        at org.apache.hadoop.hdfs.DFSUtil.getNameServiceId(DFSUtil.java:1158)
        at org.apache.hadoop.hdfs.DFSUtil.getNamenodeNameServiceId(DFSUtil.java:1113)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.getNameServiceId(NameNode.java:1822)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:1005)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:995)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1769)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1834)
2023-10-30 15:53:24,870 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1: org.apache.hadoop.HadoopIllegalArgumentException: Configuration has multiple addresses that match local node's address. Please configure the system with dfs.nameservice.id and dfs.ha.namenode.id
2023-10-30 15:53:24,874 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
{code}
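The ambiguity behind the exception in the log above can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions, not the actual DFSUtil logic: the class name, the method name, and the host and nameservice names in the usage note are hypothetical, and the real check works on resolved socket addresses rather than plain host strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Simplified, hypothetical model of the nameservice lookup; not the real DFSUtil code.
class NameServiceLookupSketch {

    /**
     * Returns the single nameservice that has a configured address on a local host.
     * nsToHosts maps each entry of dfs.nameservices to the hosts configured for it.
     * Returns null when no nameservice matches, and throws when more than one does.
     */
    static String resolveLocalNameService(Map<String, List<String>> nsToHosts,
                                          Set<String> localHosts) {
        List<String> matches = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : nsToHosts.entrySet()) {
            for (String host : entry.getValue()) {
                if (localHosts.contains(host)) {
                    matches.add(entry.getKey());
                }
            }
        }
        if (matches.size() > 1) {
            // Mirrors the situation reported in the log: the lookup is ambiguous.
            throw new IllegalArgumentException(
                "Configuration has multiple addresses that match local node's address: "
                    + matches);
        }
        return matches.isEmpty() ? null : matches.get(0);
    }
}
```

With only a regular nameservice (say "ns1" on hosts "nn-host-1" and "nn-host-2") the lookup on "nn-host-1" returns "ns1". Once a Router nameservice whose address is also on "nn-host-1" is added to the map, the same call throws, which is the behaviour the NameNode shows at startup.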
Solution
Add a dfs.federation.router.ns.name configuration to hdfs-site.xml that marks
the Router NS name, and filter out the Router NS during NameNode or ZKFC
startup to avoid this issue.
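A minimal sketch of the proposed filtering step follows. It is not taken from a patch: the class and method names are hypothetical, and it assumes that dfs.federation.router.ns.name holds a single nameservice name and that dfs.nameservices is the usual comma-separated list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper illustrating the proposed filtering; not code from a Hadoop patch.
class RouterNsFilterSketch {

    /**
     * Splits the dfs.nameservices value and drops the Router nameservice.
     * routerNs is the value of the proposed dfs.federation.router.ns.name key;
     * when it is null or empty, nothing is filtered.
     */
    static List<String> filterRouterNs(String nameservices, String routerNs) {
        List<String> result = new ArrayList<>();
        if (nameservices == null) {
            return result;
        }
        for (String ns : nameservices.split(",")) {
            String trimmed = ns.trim();
            if (trimmed.isEmpty()) {
                continue; // ignore empty entries such as a trailing comma
            }
            if (routerNs != null && trimmed.equals(routerNs.trim())) {
                continue; // skip the Router NS so it never reaches the local-address check
            }
            result.add(trimmed);
        }
        return result;
    }
}
```

For example, with dfs.nameservices set to "ns1,ns2,ns-fed" and the new key set to "ns-fed" (all three names are made up for illustration), the NameNode or ZKFC would only consider ns1 and ns2 when working out which NS the local node belongs to.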