[jira] [Commented] (HDFS-13183) Standby NameNode process getBlocks request to reduce Active load

Jim Brennan (Jira) Tue, 19 May 2020 15:06:32 -0700


    [ 
https://issues.apache.org/jira/browse/HDFS-13183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17111578#comment-17111578
 ]


Jim Brennan commented on HDFS-13183:
------------------------------------

[~weichiu], [~hexiaoqiao], I believe this change is causing 
TestBalancerWithNodeGroup to fail: 
[https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/146/testReport/junit/org.apache.hadoop.hdfs.server.balancer/TestBalancerWithNodeGroup/testBalancerEndInNoMoveProgress/]

The problem is that Balancer.doBalance() was changed to construct the 
NameNodeConnectors inside the iteration loop. The counter that tracks how many 
iterations have passed without a move ({{notChangedIterations}}) lives in the 
NameNodeConnector, but it is intended to persist across iterations. Since we are 
now creating new connectors on each iteration, the counter always starts at zero, 
so the balancer will never exit with ExitStatus.NO_MOVE_PROGRESS.
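
To illustrate the failure mode, here is a minimal, self-contained sketch. It is not the actual Hadoop code: {{Connector}}, {{shouldContinue}}, {{iterationOfNoProgressExit}}, and the limit of 5 are hypothetical stand-ins chosen only to show how a per-iteration connector resets the counter.

```java
// Simplified model of the balancer loop. All names here are hypothetical
// stand-ins for illustration, not the real Hadoop classes or signatures.
class NoMoveProgressSketch {

    /** Stand-in for a connector that owns the no-progress counter. */
    static final class Connector {
        private final int maxNotChangedIterations;
        private int notChangedIterations = 0;

        Connector(int maxNotChangedIterations) {
            this.maxNotChangedIterations = maxNotChangedIterations;
        }

        /** Returns false once too many consecutive iterations moved no bytes. */
        boolean shouldContinue(long bytesMoved) {
            if (bytesMoved > 0) {
                notChangedIterations = 0;
                return true;
            }
            notChangedIterations++;
            return notChangedIterations < maxNotChangedIterations;
        }
    }

    /**
     * Simulates iterations in which nothing is moved. Returns the iteration
     * at which the loop gives up for lack of progress, or -1 if it never does
     * within maxIterations.
     */
    static int iterationOfNoProgressExit(boolean newConnectorPerIteration,
                                         int maxIterations) {
        final int maxNotChanged = 5; // assumed limit, for illustration only
        Connector shared = new Connector(maxNotChanged);
        for (int i = 1; i <= maxIterations; i++) {
            // Creating the connector inside the loop discards the counter.
            Connector c = newConnectorPerIteration
                ? new Connector(maxNotChanged)
                : shared;
            if (!c.shouldContinue(0L)) {
                return i; // would correspond to a no-move-progress exit
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println("shared connector exits at iteration: "
            + iterationOfNoProgressExit(false, 100));
        System.out.println("new connector per iteration exits at: "
            + iterationOfNoProgressExit(true, 100));
    }
}
```

In this model the shared connector gives up at iteration 5, while the per-iteration connector never does (-1), which matches the behavior described above.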

 

> Standby NameNode process getBlocks request to reduce Active load
> ----------------------------------------------------------------
>
>                 Key: HDFS-13183
>                 URL: https://issues.apache.org/jira/browse/HDFS-13183
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: balancer & mover, namenode
>            Reporter: Xiaoqiao He
>            Assignee: Xiaoqiao He
>            Priority: Major
>             Fix For: 3.3.1, 3.4.0
>
>         Attachments: HDFS-13183-trunk.001.patch, HDFS-13183-trunk.002.patch, 
> HDFS-13183-trunk.003.patch, HDFS-13183.004.patch, HDFS-13183.005.patch, 
> HDFS-13183.006.patch, HDFS-13183.007.patch
>
>
> The performance of the Active NameNode can suffer when the {{Balancer}} 
> requests #getBlocks, because querying the blocks of overly full DNs is 
> currently extremely inefficient. The main reason is that 
> {{NameNodeRpcServer#getBlocks}} holds the read lock for a long time. In the 
> extreme case, all handlers of the Active NameNode RPC server are occupied by 
> one reader, {{NameNodeRpcServer#getBlocks}}, and other write operation calls, 
> so the Active NameNode appears dead for seconds or even minutes.
> Similar Balancer performance concerns have been reported in HDFS-9412, 
> HDFS-7967, etc.
> If the Standby NameNode can shoulder the heavy #getBlocks burden, it could 
> speed up balancing and reduce the performance impact on the Active NameNode.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
