[ https://issues.apache.org/jira/browse/HDFS-17061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17738581#comment-17738581 ]
Stephen O'Donnell commented on HDFS-17061:
------------------------------------------

I don't know of any tool for this, and the datanode certainly does not know whether a block is a data block or a parity block. You might be able to do some analysis from fsck output, but I have never tried that for this kind of EC analysis.

> EC: Let data blocks and parity blocks on DNs more balanced
> ----------------------------------------------------------
>
>                 Key: HDFS-17061
>                 URL: https://issues.apache.org/jira/browse/HDFS-17061
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: balancer & mover, erasure-coding, hdfs
>            Reporter: WangYuanben
>            Priority: Minor
>         Attachments: figure1, unbalanced traffic load on DNs.png, figure2,
> balanced traffic load on DNs.png
>
>
> When choosing a DN for placing a data block or parity block, the existing
> numbers of data blocks and parity blocks on the datanode are not taken into
> consideration. This may lead to *uneven traffic load*.
> As shown in figure 1, when reading block groups A, B, C, D and E from five
> different EC files without any missing blocks, datanodes like DN1 and DN2 will
> have high traffic load. However, datanodes like DN3, DN4 and DN5 may have low
> or even no traffic load.
> !figure1, unbalanced traffic load on DNs.png|width=600,height=333!
> +If we can make data blocks and parity blocks on DNs more balanced, the
> traffic load in the cluster will be more balanced and the peak traffic load on
> each DN will be reduced+. Here "balance" means that the ratio of data blocks
> to parity blocks on a DN matches its EC policy. In the ideal state, each DN
> has a balanced traffic load, as figure 2 shows.
> !figure2, balanced traffic load on DNs.png|width=600,height=333!
> So how can this imbalance be reduced? I think it's related to the EC policy
> and the ratio of data blocks to parity blocks on each datanode. For
> RS-3-2-1024k, it's appropriate to keep the ratio close to 3:2.
> There are two solutions:
> 1. Improve the block placement policy.
> 2. Improve the Balancer.
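
To make the 3:2 target for RS-3-2-1024k concrete, here is a minimal sketch of how the per-datanode ratio could be measured. It is not an existing HDFS tool. It assumes you have already extracted (datanode, index-in-block-group) pairs by some means, for example from fsck output, and that indices below the number of data units are data blocks while the remaining indices are parity blocks. The function names and the sample placements are hypothetical.

```python
from collections import Counter
from fractions import Fraction


def block_role(index_in_group, data_units, parity_units):
    """Classify an internal block by its index in the block group.

    Assumption: indices [0, data_units) are data blocks and the
    remaining indices are parity blocks.
    """
    if not 0 <= index_in_group < data_units + parity_units:
        raise ValueError(f"index {index_in_group} is outside the block group")
    return "data" if index_in_group < data_units else "parity"


def per_datanode_counts(placements, data_units, parity_units):
    """Count data and parity blocks per datanode.

    placements: iterable of (datanode_name, index_in_group) pairs.
    """
    counts = {}
    for datanode, index in placements:
        role = block_role(index, data_units, parity_units)
        counts.setdefault(datanode, Counter())[role] += 1
    return counts


def imbalance(dn_counts, data_units, parity_units):
    """Distance between a datanode's data fraction and the policy's.

    Returns 0 when the datanode holds data and parity blocks in exactly
    the policy ratio (3:2 for RS-3-2), and grows as it drifts away.
    """
    total = dn_counts["data"] + dn_counts["parity"]
    if total == 0:
        return Fraction(0)
    target = Fraction(data_units, data_units + parity_units)
    return abs(Fraction(dn_counts["data"], total) - target)


if __name__ == "__main__":
    # Hypothetical placements for five RS-3-2 block groups on five DNs.
    # "fixed": every group puts index i on the same DN, so some DNs
    # hold only data blocks and others hold only parity blocks.
    fixed = [(f"DN{i + 1}", i) for _ in range(5) for i in range(5)]
    # "rotated": each group shifts its starting DN by one.
    rotated = [(f"DN{(i + g) % 5 + 1}", i) for g in range(5) for i in range(5)]

    for name, placements in (("fixed", fixed), ("rotated", rotated)):
        counts = per_datanode_counts(placements, 3, 2)
        for dn in sorted(counts):
            c = counts[dn]
            print(name, dn, c["data"], c["parity"], imbalance(c, 3, 2))
```

A metric like this could serve either proposed solution: the placement policy could prefer the DN that keeps the value lowest, or the Balancer could pick blocks to move from the DNs where it is highest.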