[ 
https://issues.apache.org/jira/browse/HDFS-6425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14089967#comment-14089967
 ] 

Ming Ma commented on HDFS-6425:
-------------------------------

Thanks, Arpit. This jira can address the more common NN failover scenarios 
with lots of "content stale" storages.

We try to get storages out of the "content stale" state as soon as possible. 
Here are several scenarios.

a. For non-HA NN restart, have DN send HB before BR right after registration.
b. For HA setup, NN becomes active right after it restarts. This can happen if 
we have to restart both NNs at the same time, due to some rare outage or some 
incompatible upgrade. In this case, the active NN will first go to standby, 
then get transitioned to active, at which point all DNs will be marked as stale 
again. For big clusters, most of the DN reregistrations will come in after the 
NN becomes active, so the fix to have DNs send HB and BR right after 
registration will also help.
c. For HA setup, NN becomes active after the NN JVM has been up for some time. 
The failover could happen due to a zk session timeout, or because the other NN 
just crashes. In this case, there is no DN reregistration, given that the new 
active NN has not restarted recently. We could change the NN to ask DNs to 
resend block reports upon failover, but that would cause cluster performance 
issues.

So we still have some scenarios where we might have lots of "content stale" 
storages. This jira tries to make the NN handle those scenarios better.
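
To illustrate why "content stale" storages matter here, below is a rough toy 
model in Java. It is only a sketch under my own assumptions, not the actual 
BlockManager code: the class StaleStorageModel and all of its method names are 
hypothetical. It shows storages being marked stale on failover, over-replicated 
blocks being postponed while any replica sits on a stale storage, and a rescan 
that has to walk the whole postponed set.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only. All names are hypothetical and do not match HDFS internals.
class StaleStorageModel {
    // storage ID -> true while the NN has not yet received a block report
    // from that storage since the last failover
    private final Map<String, Boolean> contentStale = new HashMap<>();
    // block ID -> storages that hold a replica of the block
    private final Map<String, List<String>> blockToStorages = new HashMap<>();
    // over-replicated blocks that cannot be processed safely yet
    private final Set<String> postponed = new HashSet<>();

    void addBlock(String blockId, List<String> storageIds) {
        blockToStorages.put(blockId, storageIds);
        for (String s : storageIds) {
            contentStale.putIfAbsent(s, false);
        }
    }

    // On failover, every known storage becomes "content stale".
    void failover() {
        contentStale.replaceAll((id, stale) -> true);
    }

    // A block report from a storage clears its stale flag.
    void blockReport(String storageId) {
        contentStale.put(storageId, false);
    }

    private boolean hasStaleReplica(String blockId) {
        for (String s : blockToStorages.get(blockId)) {
            if (contentStale.get(s)) {
                return true;
            }
        }
        return false;
    }

    // Returns true if the block can be processed now; otherwise postpones it.
    boolean handleOverReplicated(String blockId) {
        if (hasStaleReplica(blockId)) {
            postponed.add(blockId);
            return false;
        }
        return true;
    }

    // Walks the whole postponed set; the cost grows with its size.
    // Returns the number of blocks that are still postponed.
    int rescanPostponed() {
        postponed.removeIf(b -> !hasStaleReplica(b));
        return postponed.size();
    }
}
```

In this model a block stays postponed until every storage holding one of its 
replicas has sent a block report, so repeated failovers before the set drains 
only add to it, and each rescan has more entries to visit.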


> Large postponedMisreplicatedBlocks has impact on blockReport latency
> --------------------------------------------------------------------
>
>                 Key: HDFS-6425
>                 URL: https://issues.apache.org/jira/browse/HDFS-6425
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Ming Ma
>            Assignee: Ming Ma
>         Attachments: HDFS-6425-Test-Case.pdf, HDFS-6425.patch
>
>
> Sometimes we have large number of over replicates when NN fails over. When 
> the new active NN took over, over replicated blocks will be put to 
> postponedMisreplicatedBlocks until all DNs for that block aren't stale 
> anymore.
> We have a case where NNs flip flop. Before postponedMisreplicatedBlocks 
> became empty, NN fail over again and again. So postponedMisreplicatedBlocks 
> just kept increasing until the cluster is stable. 
> In addition, large postponedMisreplicatedBlocks could make 
> rescanPostponedMisreplicatedBlocks slow. rescanPostponedMisreplicatedBlocks 
> takes write lock. So it could slow down the block report processing.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
