[ https://issues.apache.org/jira/browse/HDFS-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arpit Agarwal updated HDFS-5153: -------------------------------- Description: When the number of blocks on the DataNode grows large we start running into a few issues: # Block reports take a long time to process on the NameNode. In testing we have seen that a block report with 6 Million blocks takes close to one second to process on the NameNode. The NameSystem write lock is held during this time. # We start hitting the default protobuf message limit of 64MB somewhere around 10 Million blocks. While we can increase the message size limit it already takes over 7 seconds to serialize/unserialize a block report of this size. HDFS-2832 has introduced the concept of a DataNode as a collection of storages i.e. the NameNode is aware of all the volumes attached to a given DataNode. this makes it easy to split block reports from the DN by sending one report per attached storage. was: When the number of blocks on the DataNode grows large we start running into a few issues: # Block reports take a long time to process on the NameNode. In testing we have seen that a block report with 6 Million blocks takes close to one second to process on the NameNode. The NameSystem write lock is held during this time. # We start hitting the default protobuf message limit of 64MB somewhere around 10 Million blocks. While we can increase the message size limit it already takes over 7 seconds to serialize/unserialize a block report of this size. HDFS-2832 has introduced the concept of a DataNode as a collection of storages i.e. we can > Datanode should stagger block reports from individual storages > -------------------------------------------------------------- > > Key: HDFS-5153 > URL: https://issues.apache.org/jira/browse/HDFS-5153 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode > Affects Versions: 3.0.0 > Reporter: Arpit Agarwal > > When the number of blocks on the DataNode grows large we start running into a > few issues: > # Block reports take a long time to process on the NameNode. In testing we > have seen that a block report with 6 Million blocks takes close to one second > to process on the NameNode. The NameSystem write lock is held during this > time. > # We start hitting the default protobuf message limit of 64MB somewhere > around 10 Million blocks. While we can increase the message size limit it > already takes over 7 seconds to serialize/unserialize a block report of this > size. > HDFS-2832 has introduced the concept of a DataNode as a collection of > storages i.e. the NameNode is aware of all the volumes attached to a given > DataNode. this makes it easy to split block reports from the DN by sending > one report per attached storage. -- This message was sent by Atlassian JIRA (v6.1.5#6160)