[ https://issues.apache.org/jira/browse/HDFS-15589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17199870#comment-17199870 ]
Xiaoqiao He commented on HDFS-15589:
------------------------------------

Thanks [~zhengchenyu] for your report. I wonder whether there is any impact on the NameNode when PMB (short for `PostponedMisreplicatedBlocks`) stays large for a long time. In my experience the PMB count has reached nearly 100M, and I did not see any performance issue on my internal branch. What issues did you run into? Thanks.

> Huge PostponedMisreplicatedBlocks can't decrease immediately when start
> namenode after datanode
> -----------------------------------------------------------------------------------------------
>
>                 Key: HDFS-15589
>                 URL: https://issues.apache.org/jira/browse/HDFS-15589
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs
>         Environment: CentOS 7
>            Reporter: zhengchenyu
>            Priority: Major
>
> In our test cluster, I restarted the namenode and then found many
> PostponedMisreplicatedBlocks, which did not decrease immediately.
> I searched the log and found entries like these.
> {code:java}
> 2020-09-21 17:02:37,029 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport:
> from DatanodeRegistration(xx.xx.xx.xx:9866,
> datanodeUuid=c6a9934f-afd4-4437-b976-fed55173ce57, infoPort=9864,
> infoSecurePort=0, ipcPort=9867,
> storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834),
> reports.length=12
> 2020-09-21 17:02:37,029 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport:
> from DatanodeRegistration(xx.xx.xx.xx:9866,
> datanodeUuid=aee144f1-2082-4bca-a92b-f3c154a71c65, infoPort=9864,
> infoSecurePort=0, ipcPort=9867,
> storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834),
> reports.length=12
> 2020-09-21 17:02:37,029 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport:
> from DatanodeRegistration(xx.xx.xx.xx:9866,
> datanodeUuid=d152fa5b-1089-4bfc-b9c4-e3a7d98c7a7b, infoPort=9864,
> infoSecurePort=0, ipcPort=9867,
> storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834),
> reports.length=12
> 2020-09-21 17:02:37,156 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport:
> from DatanodeRegistration(xx.xx.xx.xx:9866,
> datanodeUuid=5cffc1fe-ace9-4af8-adfc-6002a7f5565d, infoPort=9864,
> infoSecurePort=0, ipcPort=9867,
> storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834),
> reports.length=12
> 2020-09-21 17:02:37,161 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport:
> from DatanodeRegistration(xx.xx.xx.xx:9866,
> datanodeUuid=9980d8e1-b0d9-4657-b97d-c803f82c1459, infoPort=9864,
> infoSecurePort=0, ipcPort=9867,
> storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834),
> reports.length=12
> 2020-09-21 17:02:37,197 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport:
> from DatanodeRegistration(xx.xx.xx.xx:9866,
> datanodeUuid=77ff3f5e-37f0-405f-a16c-166311546cae, infoPort=9864,
> infoSecurePort=0, ipcPort=9867,
> storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834),
> reports.length=12
> {code}
> Note: the test cluster has only 6 datanodes.
> You can see that the blockreport is processed before "Marking all datanodes
> as stale", which is logged by startActiveServices. But
> DatanodeStorageInfo.blockContentsStale is set to false only in blockreport,
> and startActiveServices then marks all datanodes as stale. So the datanodes
> stay stale until the next blockreport, and PostponedMisreplicatedBlocks
> stays huge.
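The ordering described in the report can be illustrated with a toy model. This is only a sketch under stated assumptions, not the actual HDFS code: the class `StaleOrderingSketch`, the nested `Storage` type, and the methods `receiveBlockReport`, `markStale`, and `staleCount` are hypothetical names made up for illustration. Only the field name `blockContentsStale` comes from the report. The model assumes a block report clears the stale flag and that activation sets it again on every storage.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the ordering described in the report; not the real HDFS classes. */
final class StaleOrderingSketch {

    /** Hypothetical stand-in for one datanode storage and its staleness flag. */
    static final class Storage {
        // Assumption: contents are treated as stale until a block report arrives.
        boolean blockContentsStale = true;

        void receiveBlockReport() { blockContentsStale = false; }

        void markStale() { blockContentsStale = true; }
    }

    /** Returns how many storages are still stale after both events have happened. */
    static int staleCount(int datanodes, boolean reportBeforeActivation) {
        List<Storage> storages = new ArrayList<>();
        for (int i = 0; i < datanodes; i++) {
            storages.add(new Storage());
        }

        if (reportBeforeActivation) {
            // Order seen in the report: block reports first, then "mark all stale".
            storages.forEach(Storage::receiveBlockReport);
            storages.forEach(Storage::markStale);
        } else {
            // Opposite order: the block reports clear the flag afterwards.
            storages.forEach(Storage::markStale);
            storages.forEach(Storage::receiveBlockReport);
        }

        int stale = 0;
        for (Storage s : storages) {
            if (s.blockContentsStale) {
                stale++;
            }
        }
        return stale;
    }

    public static void main(String[] args) {
        System.out.println("report first:     " + staleCount(6, true));
        System.out.println("activation first: " + staleCount(6, false));
    }
}
```

In this model, with 6 datanodes, every storage is still stale when the reports arrive before activation, and none is stale in the opposite order. In the first case the flags would clear only when the next block report arrives, which matches the behaviour described in the report.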