caozhiqiang created HDFS-17818:
----------------------------------
Summary: Fix serial fsimage transfer during checkpoint with multiple namenodes
Key: HDFS-17818
URL: https://issues.apache.org/jira/browse/HDFS-17818
Project: Hadoop HDFS
Issue Type: Bug
Components: namenode
Affects Versions: 3.5.0
Reporter: caozhiqiang
Assignee: caozhiqiang
In our cluster, each namespace has four NameNodes: one active, one standby, and
two observers. When the standby NameNode performs a checkpoint, it transfers the
fsimage to the other three NameNodes. However, we found that these transfers are
performed serially.
The reason is that the corePoolSize of the ThreadPoolExecutor is 0, and the
transfer tasks never fill the LinkedBlockingQueue. ThreadPoolExecutor starts
threads beyond corePoolSize only when the queue rejects a task, so only one
thread transfers the fsimage at a time. This greatly increases the checkpoint
time.
{code:java}
ExecutorService executor = new ThreadPoolExecutor(0,
    activeNNAddresses.size(), 100, TimeUnit.MILLISECONDS,
    new LinkedBlockingQueue<Runnable>(activeNNAddresses.size()),
    uploadThreadFactory);
{code}
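The behavior can be reproduced outside HDFS with a small standalone sketch. It builds a pool with the same shape as the snippet above and records the peak number of tasks running at once. The class name UploadPoolDemo, the method peakConcurrency, and the 200 ms sleep standing in for an fsimage upload are all illustrative and not part of the Hadoop code. Setting corePoolSize equal to the number of targets is shown only as one possible way to get parallel transfers, not necessarily the change adopted for this issue.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; it only mirrors the pool parameters from the report.
class UploadPoolDemo {

  // Runs `targets` dummy uploads and returns the peak number running at once.
  static int peakConcurrency(int corePoolSize, int targets) {
    ThreadPoolExecutor executor = new ThreadPoolExecutor(corePoolSize,
        targets, 100, TimeUnit.MILLISECONDS,
        new LinkedBlockingQueue<Runnable>(targets));
    AtomicInteger running = new AtomicInteger();
    AtomicInteger peak = new AtomicInteger();
    for (int i = 0; i < targets; i++) {
      executor.execute(() -> {
        // Track how many tasks are active right now and remember the maximum.
        peak.accumulateAndGet(running.incrementAndGet(), Math::max);
        try {
          Thread.sleep(200); // stand-in for one fsimage upload (assumption)
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        } finally {
          running.decrementAndGet();
        }
      });
    }
    executor.shutdown();
    try {
      executor.awaitTermination(10, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return peak.get();
  }

  public static void main(String[] args) {
    // corePoolSize 0: tasks queue up behind a single worker thread.
    System.out.println("corePoolSize=0 peak: " + peakConcurrency(0, 3));
    // corePoolSize 3: each task gets its own core thread immediately.
    System.out.println("corePoolSize=3 peak: " + peakConcurrency(3, 3));
  }
}
```

With corePoolSize 0 the queue has room for every task, so the executor never grows past its single worker. With corePoolSize equal to the number of targets, each submitted task starts a new core thread and the dummy uploads overlap.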