[
https://issues.apache.org/jira/browse/HDFS-17106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18029135#comment-18029135
]
ASF GitHub Bot commented on HDFS-17106:
---------------------------------------
github-actions[bot] commented on PR #6090:
URL: https://github.com/apache/hadoop/pull/6090#issuecomment-3392652819
We're closing this stale PR because it has been open for 100 days with no
activity. This isn't a judgement on the merit of the PR in any way. It's just a
way of keeping the PR queue manageable.
If you feel like this was a mistake, or you would like to continue working
on it, please feel free to re-open it and ask for a committer to remove the
stale tag and review again.
Thanks all for your contribution.
> NameNode can timeout during initialization with
> dfs.datanode.volumes.replica-add.threadpool.size being 0
> --------------------------------------------------------------------------------------------------------
>
> Key: HDFS-17106
> URL: https://issues.apache.org/jira/browse/HDFS-17106
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: ConfX
> Priority: Critical
> Labels: pull-request-available
> Attachments: reproduce.sh
>
>
> h2. What happened:
> When {{dfs.datanode.volumes.replica-add.threadpool.size}} is set to 0, the
> HDFS cluster never starts and eventually times out.
> h2. Buggy code:
> Still investigating
> h2. StackTrace:
> {noformat}
> java.io.IOException: Timed out waiting for Mini HDFS Cluster to start
>     at org.apache.hadoop.hdfs.MiniDFSCluster.waitClusterUp(MiniDFSCluster.java:1503)
>     at org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:973)
>     at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:576)
>     at org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:518)
>     at org.apache.hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics.setUp(TestNameNodeMetrics.java:166)
> {noformat}
> h2. Reproduce:
> (1) Set {{dfs.datanode.volumes.replica-add.threadpool.size}} to 0
> (2) Run a simple test that exercises this parameter, e.g.
> {{org.apache.hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics#testExcessBlocks}}
> h2. Solution:
> If 0 is not a valid value, it would be good to have a check that rejects it.
> For example, {{file.bytes-per-checksum}} has a check that requires its value
> to be larger than 0:
> {noformat}
> Preconditions.checkState(bytesPerChecksum > 0,
>     "bytes per checksum should be positive but was %s", bytesPerChecksum);
> {noformat}
> We could add a similar check:
> {noformat}
> Preconditions.checkState(addReplicaThreadPool > 0,
>     "addReplicaThreadPool should be positive but was %s", addReplicaThreadPool);
> {noformat}
> For an easy reproduction, run the reproduce.sh in the attachment. We are
> happy to provide a patch if this issue is confirmed.
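The check proposed in the description could look roughly like the sketch below. It is an illustration only, not the Hadoop patch: the class name ReplicaAddPoolConfig and the method validate are hypothetical, and a plain JDK exception stands in for Guava's Preconditions.checkState so the snippet compiles without Hadoop on the classpath. Only the configuration key and the message wording come from the report.

```java
// Hypothetical helper, not part of Hadoop: rejects a non-positive pool size
// before any thread pool would be built from it.
final class ReplicaAddPoolConfig {
    // Configuration key as named in the issue report.
    static final String KEY = "dfs.datanode.volumes.replica-add.threadpool.size";

    private ReplicaAddPoolConfig() {}

    // Stand-in for Preconditions.checkState(addReplicaThreadPool > 0, ...):
    // returns the value unchanged when it is positive, otherwise throws
    // IllegalStateException with a message naming the offending value.
    static int validate(int addReplicaThreadPool) {
        if (addReplicaThreadPool <= 0) {
            throw new IllegalStateException(String.format(
                "%s should be positive but was %s", KEY, addReplicaThreadPool));
        }
        return addReplicaThreadPool;
    }
}
```

With a check of this kind, a value of 0 would fail immediately with a message that names the key, rather than surfacing later as a startup timeout. Where such a check would belong in the DataNode code is not established in this report.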