[ https://issues.apache.org/jira/browse/HBASE-28881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Duo Zhang resolved HBASE-28881.
-------------------------------
Fix Version/s: 2.7.0, 3.0.0-beta-2, 2.6.4, 2.5.13 (was: 4.0.0-alpha-1)
Hadoop Flags: Reviewed
Assignee: Ariadne_team
Resolution: Fixed
Pushed to all active branches.
Thanks [~ariadne]!
> Setting `hbase.master.procedure.threads` to negative value doesn't break
> HMaster but clients cannot connect
> -----------------------------------------------------------------------------------------------------------
>
> Key: HBASE-28881
> URL: https://issues.apache.org/jira/browse/HBASE-28881
> Project: HBase
> Issue Type: Bug
> Components: master
> Affects Versions: 2.4.2, 2.6.0, 3.0.0-beta-1
> Reporter: Ariadne_team
> Assignee: Ariadne_team
> Priority: Critical
> Labels: pull-request-available
> Fix For: 2.7.0, 3.0.0-beta-2, 2.6.4, 2.5.13
>
> Attachments: HBASE-28881-000.patch, HBASE-28881-001.patch
>
>
> ============================
> Problem
> -------------------------------------------------
> When we set 'hbase.master.procedure.threads' to a negative value as follows:
> <property>
> <name>hbase.master.procedure.threads</name>
> <value>-1</value>
> </property>
> We found that HMaster starts normally, but the HBase client cannot connect to
> the server. Additionally, there are no related error messages in the HMaster
> logs, making it difficult for users to diagnose the root cause of the issue.
> The root cause may be in the following code:
> After 'hbase.master.procedure.threads' is parsed and loaded in
> createProcedureExecutor(), it will be propagated to init():
> {code:java}
> private void createProcedureExecutor() throws IOException {
>   final int numThreads =
>     conf.getInt(MasterProcedureConstants.MASTER_PROCEDURE_THREADS, Math.max(
>       (cpus > 0 ? cpus / 4 : 0),
>       MasterProcedureConstants.DEFAULT_MIN_MASTER_PROCEDURE_THREADS));
>   ...
>   procedureExecutor.init(numThreads, abortOnCorruption);
> } {code}
> In the {{init}} function, the parameter {{numThreads}} (used as
> {{corePoolSize}} in the loop below) determines how many worker threads are
> created. Because the configured value is -1, the loop body never executes and
> no worker threads are initialized. This leads to the client being unable to
> connect.
> {code:java}
> for (int i = 0; i < corePoolSize; ++i) {
>   workerThreads.add(new WorkerThread(threadGroup));
> } {code}
> However, when this failure occurs, there are no error logs in the HMaster
> that explicitly point to this configuration parameter, making it difficult
> for users to diagnose the root cause.
> It is recommended that validation checks and corresponding log messages for
> this configuration parameter be added to assist users in diagnosing this
> issue.
>
> ============================
> Solution (the attached patch)
> -------------------------------------------------
> Since {{numThreads}} is declared as final in the {{createProcedureExecutor}}
> method and cannot be modified, it may be beneficial to add logging within
> that method to capture the configuration value.
> {code:java}
> @@ -1743,6 +1743,9 @@ public class HMaster extends HBaseServerBase<MasterRpcServices> implements Maste
> int cpus = Runtime.getRuntime().availableProcessors();
> final int numThreads =
> conf.getInt(MasterProcedureConstants.MASTER_PROCEDURE_THREADS, Math.max(
> (cpus > 0 ? cpus / 4 : 0),
> MasterProcedureConstants.DEFAULT_MIN_MASTER_PROCEDURE_THREADS));
> + if (numThreads <= 0) {
> +   LOG.warn(MasterProcedureConstants.MASTER_PROCEDURE_THREADS + " is set to {}.", numThreads);
> + }
> final boolean abortOnCorruption =
> conf.getBoolean(MasterProcedureConstants.EXECUTOR_ABORT_ON_CORRUPTION,
> MasterProcedureConstants.DEFAULT_EXECUTOR_ABORT_ON_CORRUPTION);
> {code}
>
> This is the situation I encountered and a possible mitigation. If you need
> any further information, please let me know. Thank you.
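
For readers following the description above, here is a minimal stand-alone sketch of the failure mode and of one possible guard. It is not HBase code: the class ProcedureThreadCheck, its methods, and the fallback value of 16 are hypothetical names and values chosen for illustration. Only the shape of the worker-creation loop is taken from the report. Note that the attached patch only logs a warning; falling back to a default, as sketched here, is a different approach and is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone sketch; none of these names exist in HBase.
final class ProcedureThreadCheck {
  // Hypothetical fallback value, used only for this illustration.
  static final int FALLBACK_THREADS = 16;

  // Same loop shape as the one quoted in the report: a non-positive
  // corePoolSize means the body never runs and no workers are created.
  static int workersCreated(int corePoolSize) {
    List<Thread> workerThreads = new ArrayList<>();
    for (int i = 0; i < corePoolSize; ++i) {
      workerThreads.add(new Thread(() -> { }));
    }
    return workerThreads.size();
  }

  // One possible guard (an assumption, not the attached patch):
  // report the bad value and substitute a fallback.
  static int sanitize(String key, int configured, int fallback) {
    if (configured <= 0) {
      System.err.println(key + " is set to " + configured
        + "; using " + fallback + " instead.");
      return fallback;
    }
    return configured;
  }

  public static void main(String[] args) {
    String key = "hbase.master.procedure.threads";
    // Unchecked negative value: zero workers.
    System.out.println(workersCreated(-1));
    // Guarded value: the fallback number of workers.
    System.out.println(workersCreated(sanitize(key, -1, FALLBACK_THREADS)));
  }
}
```

Running main prints 0 for the unchecked value and 16 for the guarded one, along with a message on stderr that names the offending key.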
--
This message was sent by Atlassian Jira
(v8.20.10#820010)