[ https://issues.apache.org/jira/browse/HDFS-17223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17801116#comment-17801116 ]
ASF GitHub Bot commented on HDFS-17223:
---------------------------------------

gp1314 commented on PR #6183:
URL: https://github.com/apache/hadoop/pull/6183#issuecomment-1871841096

   Unfortunately, I didn't reproduce the problem. In the past, stopping a JN and restarting NN took a long time to initialize. I will pay more attention to the root cause of the problem.


> Add journalnode maintenance node list
> -------------------------------------
>
>                 Key: HDFS-17223
>                 URL: https://issues.apache.org/jira/browse/HDFS-17223
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: qjm
>    Affects Versions: 3.3.6
>            Reporter: kuper
>            Priority: Major
>              Labels: pull-request-available
>
> * In the case of configuring 3 journal nodes in HDFS, if only 2 journal nodes
> are available and 1 journal node fails to start due to machine issues, it
> will result in a long initialization time for the namenode (around 30-40
> minutes, depending on the IPC timeout and retry policy configuration).
> * The failed journal node cannot recover immediately, but HDFS can still
> function in this situation. In our production environment, we encountered
> this issue and had to reduce the IPC timeout and adjust the retry policy to
> accelerate the namenode initialization and provide services.
> * I'm wondering if it would be possible to have a journal node maintenance
> list to speed up the namenode initialization knowing that one journal node
> cannot provide services in advance?
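
The maintenance-list idea in the issue description can be illustrated with a minimal sketch. This is not the code from PR #6183 and not an HDFS API: the function names, the host names, and the list-based representation are all assumptions made for illustration. The sketch only shows the reasoning from the description: with 3 configured journal nodes and 1 known to be down, the remaining 2 still form a majority, so a namenode could skip the down node up front instead of waiting on IPC timeouts and retries.

```python
def active_journal_nodes(configured, maintenance):
    """Return the configured journal nodes that are not marked for maintenance."""
    skip = set(maintenance)
    return [jn for jn in configured if jn not in skip]


def has_write_quorum(configured, maintenance):
    """True if the non-maintenance nodes still form a majority of the configured set."""
    majority = len(configured) // 2 + 1
    return len(active_journal_nodes(configured, maintenance)) >= majority


# Hypothetical host names, for illustration only.
journal_nodes = ["jn1:8485", "jn2:8485", "jn3:8485"]
maintenance_list = ["jn3:8485"]

# Nodes the namenode would still try to contact during initialization.
print(active_journal_nodes(journal_nodes, maintenance_list))
# Whether a majority remains once the maintenance node is skipped.
print(has_write_quorum(journal_nodes, maintenance_list))
```

The quorum check matters because a maintenance list that excluded 2 of 3 nodes would leave no majority, in which case skipping them would not help the namenode start.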