[ https://issues.apache.org/jira/browse/HDFS-5346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kihwal Lee updated HDFS-5346:
-----------------------------
    Description:
When initial block reports are being processed, checkMode() is called from incrementSafeBlockCount(). This causes the replication queues to be initialized in the middle of processing a block report in the IBR processing mode. If there are many block reports waiting to be processed, SafeModeMonitor won't be able to make name node leave the safe mode soon. It appears that the block report processing speed degrades considerably during this time.

Update: The main issue can be resolved by config. The other issue of calling getNumLiveDataNodes() for each block in the block report will be addressed in this jira

  was:
When initial block reports are being processed, checkMode() is called from incrementSafeBlockCount(). This causes the replication queues to be initialized in the middle of processing a block report in the IBR processing mode. If there are many block reports waiting to be processed, SafeModeMonitor won't be able to make name node leave the safe mode soon. It appears that the block report processing speed degrades considerably during this time.

Update: The main issue can be resolved by config. The other issue of calling

> Avoid unnecessary call to getNumLiveDataNodes() for each block during IBR processing
> ------------------------------------------------------------------------------------
>
>                 Key: HDFS-5346
>                 URL: https://issues.apache.org/jira/browse/HDFS-5346
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode, performance
>    Affects Versions: 0.23.9, 2.3.0
>            Reporter: Kihwal Lee
>            Assignee: Ravi Prakash
>             Fix For: 2.3.0, 0.23.10
>
>         Attachments: HDFS-5346.branch-23.patch, HDFS-5346.branch-23.patch,
>                      HDFS-5346.patch, HDFS-5346.patch, HDFS-5346.patch
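
To illustrate the per-block cost mentioned in the update, here is a minimal Java sketch. It is not the attached patch and not the actual HDFS code: the class `SafeModeSketch`, its fields, and the methods `needEnterNaive()` and `needEnter()` are hypothetical, and an `IntSupplier` stands in for the costly getNumLiveDataNodes() call. The idea shown is only one possible approach: evaluate the cheap block-count condition first, so the expensive call runs only when it can change the result.

```java
import java.util.function.IntSupplier;

/** Hypothetical stand-in for a safe-mode tracker; not the HDFS implementation. */
final class SafeModeSketch {
    private final long blockThreshold;       // safe blocks needed before leaving safe mode
    private final int datanodeThreshold;     // live datanodes needed; 0 disables this check (assumption)
    private final IntSupplier liveDataNodes; // stands in for the costly getNumLiveDataNodes()
    private long blockSafe;                  // safe blocks counted so far

    SafeModeSketch(long blockThreshold, int datanodeThreshold, IntSupplier liveDataNodes) {
        this.blockThreshold = blockThreshold;
        this.datanodeThreshold = datanodeThreshold;
        this.liveDataNodes = liveDataNodes;
    }

    /** Called once for each block that becomes safe while a block report is processed. */
    void incrementSafeBlockCount() {
        blockSafe++;
    }

    /** Naive check: queries the live datanode count on every call, i.e. once per block. */
    boolean needEnterNaive() {
        int live = liveDataNodes.getAsInt();
        return blockSafe < blockThreshold || live < datanodeThreshold;
    }

    /** Cheaper check: the costly query runs only after the block threshold is met
     *  and only if a datanode threshold is configured. */
    boolean needEnter() {
        if (blockSafe < blockThreshold) {
            return true; // still short of safe blocks; datanode count cannot change this
        }
        return datanodeThreshold != 0 && liveDataNodes.getAsInt() < datanodeThreshold;
    }
}
```

Both methods return the same answer for the same state. The difference is how often the supplier is invoked: the naive version calls it for every block, while the reordered version calls it only once the block count has reached the threshold.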