[jira] [Updated] (HDFS-17356) RBF: Add Configuration dfs.federation.router.ns.name Optimization

2024-01-29 Thread xiaojunxiang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaojunxiang updated HDFS-17356:

Attachment: screenshot-5.png

> RBF: Add Configuration dfs.federation.router.ns.name Optimization
> -
>
> Key: HDFS-17356
> URL: https://issues.apache.org/jira/browse/HDFS-17356
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: dfs, rbf
> Reporter: wangzhihui
> Priority: Minor
> Attachments: image-2024-01-29-18-04-55-391.png, screenshot-1.png, 
> screenshot-2.png, screenshot-3.png, screenshot-4.png, screenshot-5.png
>
>
> When RBF (Router-Based Federation) is enabled in HDFS and the HDFS server 
> processes (NameNode, ZKFC) share both a node and an hdfs-site.xml with the 
> RBF client, the NameNode fails to start with the exception below. The cause 
> is that the nameservice (NS) of the Router service has been added to the 
> dfs.nameservices list. At startup, the NameNode looks up the NS that the 
> current node belongs to, finds more than one NS whose configured address 
> matches the local node, cannot decide between them, and fails the existing 
> validation. Currently, the only workaround is to keep separate hdfs-site.xml 
> files for the Router client and the NameNode, but splitting the 
> configuration this way makes unified management of the cluster configuration 
> harder. We therefore propose a better solution.
> {code:java}
> 2023-10-30 15:53:24,613 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
> 2023-10-30 15:53:24,672 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: createNameNode []
> 2023-10-30 15:53:24,760 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: Loaded properties from hadoop-metrics2.properties
> 2023-10-30 15:53:24,842 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled Metric snapshot period at 10 second(s).
> 2023-10-30 15:53:24,842 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system started
> 2023-10-30 15:53:24,868 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode.
> org.apache.hadoop.HadoopIllegalArgumentException: Configuration has multiple addresses that match local node's address. Please configure the system with dfs.nameservice.id and dfs.ha.namenode.id
>         at org.apache.hadoop.hdfs.DFSUtil.getSuffixIDs(DFSUtil.java:1257)
>         at org.apache.hadoop.hdfs.DFSUtil.getNameServiceId(DFSUtil.java:1158)
>         at org.apache.hadoop.hdfs.DFSUtil.getNamenodeNameServiceId(DFSUtil.java:1113)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.getNameServiceId(NameNode.java:1822)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:1005)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:995)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1769)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1834)
> 2023-10-30 15:53:24,870 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1: org.apache.hadoop.HadoopIllegalArgumentException: Configuration has multiple addresses that match local node's address. Please configure the system with dfs.nameservice.id and dfs.ha.namenode.id
> 2023-10-30 15:53:24,874 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
> {code}
>  
> hdfs-site.xml
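> {code:xml}
> <property>
>   <name>dfs.nameservices</name>
>   <value>mycluster1,mycluster2,ns-fed</value>
> </property>
> <property>
>   <name>dfs.ha.namenodes.ns-fed</name>
>   <value>r1</value>
> </property>
> <property>
>   <name>dfs.namenode.rpc-address.ns-fed.r1</name>
>   <value>node1.com:</value>
> </property>
> <property>
>   <name>dfs.ha.namenodes.mycluster1</name>
>   <value>nn1,nn2</value>
> </property>
> <property>
>   <name>dfs.namenode.http-address.mycluster1.nn1</name>
>   <value>node1.com:50070</value>
> </property>
> <property>
>   <name>dfs.namenode.http-address.mycluster1.nn2</name>
>   <value>node2.com:50070</value>
> </property>
> <property>
>   <name>dfs.ha.namenodes.mycluster2</name>
>   <value>nn1,nn2</value>
> </property>
> <property>
>   <name>dfs.namenode.http-address.mycluster2.nn1</name>
>   <value>node3.com:50070</value>
> </property>
> <property>
>   <name>dfs.namenode.http-address.mycluster2.nn2</name>
>   <value>node4.com:50070</value>
> </property>
> <property>
>   <name>dfs.client.failover.proxy.provider.ns-fed</name>
>   <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
> </property>
> <property>
>   <name>dfs.client.failover.random.order</name>
>   <value>true</value>
> </property>
> {code}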
>  
> Solution
> Add a dfs.federation.router.ns.name setting to hdfs-site.xml that marks the 
> Router NS name, and filter that NS out during NameNode and ZKFC startup to 
> avoid this issue.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Updated] (HDFS-17356) RBF: Add Configuration dfs.federation.router.ns.name Optimization

2024-01-29 Thread xiaojunxiang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaojunxiang updated HDFS-17356:

Attachment: screenshot-4.png




[jira] [Updated] (HDFS-17356) RBF: Add Configuration dfs.federation.router.ns.name Optimization

2024-01-29 Thread xiaojunxiang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaojunxiang updated HDFS-17356:

Attachment: screenshot-3.png




[jira] [Updated] (HDFS-17356) RBF: Add Configuration dfs.federation.router.ns.name Optimization

2024-01-26 Thread xiaojunxiang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaojunxiang updated HDFS-17356:

Attachment: screenshot-2.png




[jira] [Updated] (HDFS-17356) RBF: Add Configuration dfs.federation.router.ns.name Optimization

2024-01-26 Thread xiaojunxiang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaojunxiang updated HDFS-17356:

Attachment: screenshot-1.png




[jira] [Updated] (HDFS-17356) RBF: Add Configuration dfs.federation.router.ns.name Optimization

2024-01-26 Thread wangzhihui (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangzhihui updated HDFS-17356:
--
Description: 
When RBF (Router-Based Federation) is enabled in HDFS and the HDFS server 
processes (NameNode, ZKFC) share both a node and an hdfs-site.xml with the 
RBF client, the NameNode fails to start with the exception below. The cause 
is that the nameservice (NS) of the Router service has been added to the 
dfs.nameservices list. At startup, the NameNode looks up the NS that the 
current node belongs to, finds more than one NS whose configured address 
matches the local node, cannot decide between them, and fails the existing 
validation. Currently, the only workaround is to keep separate hdfs-site.xml 
files for the Router client and the NameNode, but splitting the 
configuration this way makes unified management of the cluster configuration 
harder. We therefore propose a better solution.
{code:java}
2023-10-30 15:53:24,613 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
2023-10-30 15:53:24,672 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: createNameNode []
2023-10-30 15:53:24,760 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: Loaded properties from hadoop-metrics2.properties
2023-10-30 15:53:24,842 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled Metric snapshot period at 10 second(s).
2023-10-30 15:53:24,842 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system started
2023-10-30 15:53:24,868 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode.
org.apache.hadoop.HadoopIllegalArgumentException: Configuration has multiple addresses that match local node's address. Please configure the system with dfs.nameservice.id and dfs.ha.namenode.id
        at org.apache.hadoop.hdfs.DFSUtil.getSuffixIDs(DFSUtil.java:1257)
        at org.apache.hadoop.hdfs.DFSUtil.getNameServiceId(DFSUtil.java:1158)
        at org.apache.hadoop.hdfs.DFSUtil.getNamenodeNameServiceId(DFSUtil.java:1113)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.getNameServiceId(NameNode.java:1822)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:1005)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:995)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1769)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1834)
2023-10-30 15:53:24,870 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1: org.apache.hadoop.HadoopIllegalArgumentException: Configuration has multiple addresses that match local node's address. Please configure the system with dfs.nameservice.id and dfs.ha.namenode.id
2023-10-30 15:53:24,874 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
{code}
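
To make the failure mode in the log above concrete, here is a simplified model of how a node can end up matching more than one nameservice. It is only a sketch, not the Hadoop implementation: the class LocalNsResolution and its methods are hypothetical, the configuration is a plain Map instead of Hadoop's Configuration, and the ports are placeholders (the report elides the Router port and does not show the NameNode RPC addresses).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified model of local nameservice resolution; not Hadoop code. */
public class LocalNsResolution {
    private static final String RPC_PREFIX = "dfs.namenode.rpc-address.";

    /** Collects every "nsId.nnId" suffix whose RPC address host equals localHost. */
    public static List<String> matchingSuffixes(Map<String, String> conf, String localHost) {
        List<String> matches = new ArrayList<>();
        for (Map.Entry<String, String> entry : conf.entrySet()) {
            if (!entry.getKey().startsWith(RPC_PREFIX)) {
                continue;
            }
            // Keep only the host part of "host:port".
            String host = entry.getValue().split(":", 2)[0];
            if (host.equals(localHost)) {
                matches.add(entry.getKey().substring(RPC_PREFIX.length()));
            }
        }
        return matches;
    }

    /** Returns the single matching suffix, or null if there is none; rejects ambiguity. */
    public static String resolve(Map<String, String> conf, String localHost) {
        List<String> matches = matchingSuffixes(conf, localHost);
        if (matches.size() > 1) {
            throw new IllegalArgumentException(
                "Configuration has multiple addresses that match local node's address: " + matches);
        }
        return matches.isEmpty() ? null : matches.get(0);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new LinkedHashMap<>();
        // Placeholder ports, assumed for illustration only.
        conf.put("dfs.namenode.rpc-address.ns-fed.r1", "node1.com:8888");
        conf.put("dfs.namenode.rpc-address.mycluster1.nn1", "node1.com:8020");
        conf.put("dfs.namenode.rpc-address.mycluster1.nn2", "node2.com:8020");

        System.out.println(matchingSuffixes(conf, "node1.com"));
        try {
            resolve(conf, "node1.com");
        } catch (IllegalArgumentException e) {
            System.out.println("startup would fail: " + e.getMessage());
        }
    }
}
```

In this model node1.com matches both the Router entry and the mycluster1.nn1 entry, which is the ambiguity that the exception reports.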
 

hdfs-site.xml
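{code:xml}
<property>
  <name>dfs.nameservices</name>
  <value>mycluster1,mycluster2,ns-fed</value>
</property>
<property>
  <name>dfs.ha.namenodes.ns-fed</name>
  <value>r1</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.ns-fed.r1</name>
  <value>node1.com:</value>
</property>
<property>
  <name>dfs.ha.namenodes.mycluster1</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.http-address.mycluster1.nn1</name>
  <value>node1.com:50070</value>
</property>
<property>
  <name>dfs.namenode.http-address.mycluster1.nn2</name>
  <value>node2.com:50070</value>
</property>
<property>
  <name>dfs.ha.namenodes.mycluster2</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.http-address.mycluster2.nn1</name>
  <value>node3.com:50070</value>
</property>
<property>
  <name>dfs.namenode.http-address.mycluster2.nn2</name>
  <value>node4.com:50070</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.ns-fed</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
  <name>dfs.client.failover.random.order</name>
  <value>true</value>
</property>
{code}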
 

Solution

Add a dfs.federation.router.ns.name setting to hdfs-site.xml that marks the 
Router NS name, and filter that NS out during NameNode and ZKFC startup to 
avoid this issue.
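
The filtering step proposed above could look roughly like the following. This is a minimal sketch under stated assumptions, not the patch for this issue: the class RouterNsFilter and its method are hypothetical, the configuration is modelled as a plain Map rather than Hadoop's Configuration, and treating dfs.federation.router.ns.name as a comma-separated list is an assumption (a single name works the same way).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper illustrating the proposed filter; not part of Hadoop. */
public class RouterNsFilter {
    static final String NAMESERVICES_KEY = "dfs.nameservices";
    // Key proposed in this issue; its exact semantics are assumed here.
    static final String ROUTER_NS_KEY = "dfs.federation.router.ns.name";

    /** Returns the configured nameservices minus those marked as Router NS. */
    public static List<String> nameservicesForLocalResolution(Map<String, String> conf) {
        Set<String> routerNs = new HashSet<>(splitList(conf.get(ROUTER_NS_KEY)));
        List<String> result = new ArrayList<>();
        for (String ns : splitList(conf.get(NAMESERVICES_KEY))) {
            if (!routerNs.contains(ns)) {
                result.add(ns);
            }
        }
        return result;
    }

    /** Splits a comma-separated value, trimming entries and skipping empty ones. */
    private static List<String> splitList(String value) {
        List<String> out = new ArrayList<>();
        if (value == null) {
            return out;
        }
        for (String part : value.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                out.add(trimmed);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(NAMESERVICES_KEY, "mycluster1,mycluster2,ns-fed");
        conf.put(ROUTER_NS_KEY, "ns-fed");
        // Prints [mycluster1, mycluster2]
        System.out.println(nameservicesForLocalResolution(conf));
    }
}
```

When the new key is absent the list is returned unchanged, so in this sketch existing configurations keep their current behaviour.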


[jira] [Updated] (HDFS-17356) RBF: Add Configuration dfs.federation.router.ns.name Optimization

2024-01-26 Thread Xiaoqiao He (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoqiao He updated HDFS-17356:
---
Summary: RBF: Add Configuration dfs.federation.router.ns.name Optimization  
(was: RBF Add Configuration dfs.federation.router.ns.name Optimization)

> RBF: Add Configuration dfs.federation.router.ns.name Optimization
> -
>
> Key: HDFS-17356
> URL: https://issues.apache.org/jira/browse/HDFS-17356
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: dfs, rbf
>Reporter: wangzhihui
>Priority: Minor
>
>     When enabling RBF federation in HDFS, when the HDFS server and RBFClient 
> share the same configuration and the HDFS server (NameNode、ZKFC) and 
> RBFClient are on the same node, the following exception occurs, causing 
> NameNode to fail to start; The reason is that the NS of the Router service 
> has been added to the dfs.nameservices list. When NameNode starts, it obtains 
> the NS that the current node belongs to. However, it is found that there are 
> multiple NS that cannot be recognized and cannot pass the verification of 
> existing logic, ultimately resulting in NameNode startup failure. Currently, 
> we can only solve this problem by isolating the hdfs-site.xml of RouterClient 
> and NameNode. However, grouping configuration is not conducive to our unified 
> management of cluster configuration. Therefore, we propose a new solution to 
> solve this problem better.
> {code:java}
> // code placeholder
> 2023-10-30 15:53:24,613 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: 
> registered UNIX signal handlers for [TERM, HUP, INT]
> 2023-10-30 15:53:24,672 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: 
> createNameNode []
> 2023-10-30 15:53:24,760 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: 
> Loaded properties from hadoop-metrics2.properties
> 2023-10-30 15:53:24,842 INFO 
> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled Metric snapshot 
> period at 10 second(s).
> 2023-10-30 15:53:24,842 INFO 
> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system 
> started
> 2023-10-30 15:53:24,868 ERROR 
> org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode.
> org.apache.hadoop.HadoopIllegalArgumentException: Configuration has multiple 
> addresses that match local node's address. Please configure the system with 
> dfs.nameservice.id and dfs.ha.namenode.id
>         at org.apache.hadoop.hdfs.DFSUtil.getSuffixIDs(DFSUtil.java:1257)
>         at org.apache.hadoop.hdfs.DFSUtil.getNameServiceId(DFSUtil.java:1158)
>         at 
> org.apache.hadoop.hdfs.DFSUtil.getNamenodeNameServiceId(DFSUtil.java:1113)
>         at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.getNameServiceId(NameNode.java:1822)
>         at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:1005)
>         at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:995)
>         at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1769)
>         at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1834)
> 2023-10-30 15:53:24,870 INFO org.apache.hadoop.util.ExitUtil: Exiting with 
> status 1: org.apache.hadoop.HadoopIllegalArgumentException: Configuration has 
> multiple addresses that match local node's address. Please configure the 
> system with dfs.nameservice.id and dfs.ha.namenode.id
> 2023-10-30 15:53:24,874 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: 
> SHUTDOWN_MSG: {code}
> Solution 
> Add a dfs.federation.router.ns.name configuration to hdfs-site.xml that marks 
> the Router NS name, and filter out the Router NS during NameNode or ZKFC 
> startup to avoid this issue.
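
The filtering step proposed above could look roughly like the following. This is only an illustrative sketch, not the patch attached to the issue: the class and method names (RouterNsFilter, filterRouterNs) and the sample NS names are hypothetical, and the sketch works on plain strings rather than on Hadoop's Configuration API. It assumes dfs.nameservices is a comma-separated list and that dfs.federation.router.ns.name holds a single NS name.

```java
import java.util.ArrayList;
import java.util.List;

public class RouterNsFilter {

    /**
     * Returns the entries of a comma-separated dfs.nameservices value with
     * the Router NS removed. Both arguments are plain strings here; how they
     * are read from hdfs-site.xml is outside this sketch.
     */
    public static List<String> filterRouterNs(String nameservices, String routerNs) {
        List<String> result = new ArrayList<>();
        if (nameservices == null) {
            return result;
        }
        String router = routerNs == null ? null : routerNs.trim();
        for (String ns : nameservices.split(",")) {
            String trimmed = ns.trim();
            if (trimmed.isEmpty()) {
                continue; // ignore empty entries such as "ns1,,ns2"
            }
            if (trimmed.equals(router)) {
                continue; // skip the NS that only the Router serves
            }
            result.add(trimmed);
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical values: ns1 and ns2 are NameNode nameservices,
        // ns-fed is the Router NS.
        System.out.println(filterRouterNs("ns1,ns2,ns-fed", "ns-fed")); // prints [ns1, ns2]
    }
}
```

If the Router NS name is unset (null), the list is returned unchanged, so a cluster that does not use the new key would behave as before under this sketch.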



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-17356) RBF Add Configuration dfs.federation.router.ns.name Optimization

2024-01-26 Thread Xiaoqiao He (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoqiao He updated HDFS-17356:
---
Fix Version/s: (was: 3.3.0)

> RBF Add Configuration dfs.federation.router.ns.name Optimization
> 
>
> Key: HDFS-17356
> URL: https://issues.apache.org/jira/browse/HDFS-17356
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: dfs, rbf
>Reporter: wangzhihui
>Priority: Minor
>
> When RBF federation is enabled in HDFS, and the HDFS server (NameNode, ZKFC) 
> and RBFClient run on the same node and share the same configuration, the 
> following exception occurs and NameNode fails to start. The cause is that 
> the NS of the Router service has been added to the dfs.nameservices list. 
> When NameNode starts, it looks up the NS that the current node belongs to, 
> but more than one configured NS matches the local address, so the existing 
> validation logic cannot pick one and NameNode startup fails. Currently, the 
> only workaround is to maintain separate hdfs-site.xml files for RouterClient 
> and NameNode. However, splitting the configuration this way makes unified 
> management of cluster configuration harder, so we propose a better solution.
> {code:java}
> 2023-10-30 15:53:24,613 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: 
> registered UNIX signal handlers for [TERM, HUP, INT]
> 2023-10-30 15:53:24,672 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: 
> createNameNode []
> 2023-10-30 15:53:24,760 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: 
> Loaded properties from hadoop-metrics2.properties
> 2023-10-30 15:53:24,842 INFO 
> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled Metric snapshot 
> period at 10 second(s).
> 2023-10-30 15:53:24,842 INFO 
> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system 
> started
> 2023-10-30 15:53:24,868 ERROR 
> org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode.
> org.apache.hadoop.HadoopIllegalArgumentException: Configuration has multiple 
> addresses that match local node's address. Please configure the system with 
> dfs.nameservice.id and dfs.ha.namenode.id
>         at org.apache.hadoop.hdfs.DFSUtil.getSuffixIDs(DFSUtil.java:1257)
>         at org.apache.hadoop.hdfs.DFSUtil.getNameServiceId(DFSUtil.java:1158)
>         at 
> org.apache.hadoop.hdfs.DFSUtil.getNamenodeNameServiceId(DFSUtil.java:1113)
>         at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.getNameServiceId(NameNode.java:1822)
>         at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:1005)
>         at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:995)
>         at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1769)
>         at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1834)
> 2023-10-30 15:53:24,870 INFO org.apache.hadoop.util.ExitUtil: Exiting with 
> status 1: org.apache.hadoop.HadoopIllegalArgumentException: Configuration has 
> multiple addresses that match local node's address. Please configure the 
> system with dfs.nameservice.id and dfs.ha.namenode.id
> 2023-10-30 15:53:24,874 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: 
> SHUTDOWN_MSG: {code}
> Solution 
> Add a dfs.federation.router.ns.name configuration to hdfs-site.xml that marks 
> the Router NS name, and filter out the Router NS during NameNode or ZKFC 
> startup to avoid this issue.
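
To make the startup failure described above concrete, here is a simplified model of the ambiguity check. It is an assumption-laden sketch and does not reproduce DFSUtil.getSuffixIDs, which the stack trace points at: the class LocalNsResolver, the method resolveLocalNs, and the host and NS names are all hypothetical, and each NS is reduced to a single host name instead of real RPC addresses.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LocalNsResolver {

    /**
     * Finds the single NS whose (simplified) address equals the local host.
     * Entries equal to routerNs are skipped; pass null to disable skipping.
     * Throws if more than one remaining NS matches, mirroring the
     * "multiple addresses that match local node's address" condition.
     */
    public static String resolveLocalNs(Map<String, String> nsToHost,
                                        String localHost,
                                        String routerNs) {
        String found = null;
        for (Map.Entry<String, String> entry : nsToHost.entrySet()) {
            if (entry.getKey().equals(routerNs)) {
                continue; // the Router NS is not a NameNode nameservice
            }
            if (!entry.getValue().equals(localHost)) {
                continue; // this NS lives on another node
            }
            if (found != null) {
                throw new IllegalArgumentException(
                    "Configuration has multiple addresses that match local node's address");
            }
            found = entry.getKey();
        }
        return found; // null when no NS matches the local host
    }

    public static void main(String[] args) {
        // Hypothetical layout: the NameNode for ns1 and the Router NS ns-fed
        // both resolve to the same node, host-a.
        Map<String, String> nsToHost = new LinkedHashMap<>();
        nsToHost.put("ns1", "host-a");
        nsToHost.put("ns-fed", "host-a");

        // With the Router NS marked, ns1 is the only candidate left.
        System.out.println(resolveLocalNs(nsToHost, "host-a", "ns-fed"));
    }
}
```

Calling resolveLocalNs with routerNs set to null on the same map throws, which corresponds to the situation in the log: both NS entries match the local node and nothing tells them apart.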


