[ https://issues.apache.org/jira/browse/HDFS-17223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17786000#comment-17786000 ]
ASF GitHub Bot commented on HDFS-17223: --------------------------------------- xinglin commented on code in PR #6183: URL: https://github.com/apache/hadoop/pull/6183#discussion_r1392945651 ########## hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/client/QuorumJournalManager.java: ########## @@ -667,6 +700,9 @@ AsyncLoggerSet getLoggerSetForTests() { @Override public void doPreUpgrade() throws IOException { + if (isEnableJnMaintenance()) { + throw new IOException("doPreUpgrade() does not support enabling jn maintenance mode"); Review Comment: nit: doPreUpgrade() does not support enabling jn maintenance mode -> doPreUpgrade() is not support while in jn maintenance mode ########## hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/client/QuorumJournalManager.java: ########## @@ -684,6 +720,9 @@ public void doPreUpgrade() throws IOException { @Override public void doUpgrade(Storage storage) throws IOException { + if (isEnableJnMaintenance()) { + throw new IOException("doUpgrade() does not support enabling jn maintenance mode"); Review Comment: same here. ########## hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml: ########## @@ -6333,6 +6333,20 @@ </description> </property> + <property> + <name>dfs.journalnode.maintenance.nodes</name> + <value></value> + <description> + In the case of one out of three journal nodes being down, theoretically the service can still + continue. However, in reality, the downed node may not recover quickly. If the Namenode needs + to be restarted, it will try the downed journal node through the lengthy RPC retry mechanism, + resulting in a long initialization time for the Namenode to provide services. By adding the + downed journal node to the maintenance nodes, the initialization time of the Namenode in such + scenarios can be accelerated. Review Comment: nit -> In the case that one out of three journal nodes being down, theoretically HDFS can still function. However, in reality, the unavailable journal node may not recover quickly. During this period, when we need to restart an Namenode, the Namenode will try to connect to the unavailable journal node through the lengthy RPC retry mechanism, resulting in a long initialization time for the Namenode. By adding these unavailable journal nodes to the maintenance nodes, we will skip these unavailable journal nodes during Namenode initialization and thus reduce namenode startup time. 1-node example values: <> 2-node example values: <> ########## hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSUtil.java: ########## @@ -1137,4 +1139,24 @@ public void testAddTransferRateMetricForInvalidValue() { DFSUtil.addTransferRateMetric(mockMetrics, 100, 0); verify(mockMetrics, times(0)).addReadTransferRate(anyLong()); } + + @Test + public void testGetHostSet() { + String[] testAddrs = new String[] {NS1_NN_ADDR, NS1_NN1_ADDR}; Review Comment: this test case is a bit confusing. Can we just use "unreachable-host1.com:9000" instead? ########## hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/qjournal/client/TestQuorumJournalManager.java: ########## @@ -1171,6 +1176,21 @@ public void testSelectViaRpcAfterJNRestart() throws Exception { } } + /** + * Tests to throw an exception if the jn maintenance nodes exceeds half of the journalnode number. + */ + @Test + public void testJNMaintenanceListViaRpcTwoJNsError() throws Exception { + StringJoiner maintenanceListBuff = new StringJoiner(","); + for (int i = 0; i < 2; i++) { + maintenanceListBuff.add( + NetUtils.getHostPortString(cluster.getJournalNode(i).getBoundIpcAddress())); + } + this.conf.set(DFS_JOURNALNODE_MAINTENANCE_NODES_KEY, maintenanceListBuff.toString()); + assertThrows(IllegalArgumentException.class, () -> createSpyingQJM()); + } + + Review Comment: I assume we also need a test case to to verify maintenance journal nodes are indeed excluded? total JN=3, with 1 is excluded. after we initialize, verify only 2 JNs are included. ########## hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/client/QuorumJournalManager.java: ########## @@ -701,6 +740,9 @@ public void doUpgrade(Storage storage) throws IOException { @Override public void doFinalize() throws IOException { + if (isEnableJnMaintenance()) { + throw new IOException("doFinalize() does not support enabling jn maintenance mode"); Review Comment: same here. and for others functions. > Add journalnode maintenance node list > ------------------------------------- > > Key: HDFS-17223 > URL: https://issues.apache.org/jira/browse/HDFS-17223 > Project: Hadoop HDFS > Issue Type: Improvement > Components: qjm > Affects Versions: 3.3.6 > Reporter: kuper > Priority: Major > Labels: pull-request-available > > * In the case of configuring 3 journal nodes in HDFS, if only 2 journal nodes > are available and 1 journal node fails to start due to machine issues, it > will result in a long initialization time for the namenode (around 30-40 > minutes, depending on the IPC timeout and retry policy configuration). > * The failed journal node cannot recover immediately, but HDFS can still > function in this situation. In our production environment, we encountered > this issue and had to reduce the IPC timeout and adjust the retry policy to > accelerate the namenode initialization and provide services. > * I'm wondering if it would be possible to have a journal node maintenance > list to speed up the namenode initialization knowing that one journal node > cannot provide services in advance? -- This message was sent by Atlassian Jira (v8.20.10#820010) --------------------------------------------------------------------- To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org