[ https://issues.apache.org/jira/browse/HDFS-9381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uma Maheswara Rao G updated HDFS-9381:
--------------------------------------
    Attachment: HDFS-9381.00.patch

Attached a simple patch that shows the proposed fix.

> When same block came for replication for Striped mode, we can move that block to PendingReplications
> ----------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-9381
>                 URL: https://issues.apache.org/jira/browse/HDFS-9381
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: erasure-coding, namenode
>    Affects Versions: 3.0.0
>            Reporter: Uma Maheswara Rao G
>            Assignee: Uma Maheswara Rao G
>         Attachments: HDFS-9381.00.patch
>
>
> Currently, in the replication flow for striped blocks, we just return null
> if the block already exists in pendingReplications.
> {code}
> if (block.isStriped()) {
>   if (pendingNum > 0) {
>     // Wait the previous recovery to finish.
>     return null;
>   }
> {code}
> If we just return null here and neededReplications contains only a few
> blocks (by default, fewer than numliveNodes*2), the same blocks can be
> picked again from neededReplications in the next loop, because we do not
> remove the element from neededReplications. Since this replication process
> has to take the FSNamesystem lock, we may spend some time unnecessarily in
> every loop.
> So my suggestion/improvement is:
> Instead of just returning null, how about incrementing pendingReplications
> for this block and removing it from neededReplications? Another point to
> consider: to add a block to pendingReplications we generally need a target,
> which is simply the node to which we issued the replication command. Later,
> after the replication succeeds and the DN reports the block, the block is
> removed from pendingReplications in the NN addBlock path.
> Since this block is newly picked from neededReplications, we would not have
> selected a target yet. So which target should be passed to
> pendingReplications if we add this block?
> One option I am thinking of is to pass srcNode itself as the target for
> this special condition. If the block is really missing, srcNode will not
> report it, so the block will not be removed from pendingReplications. When
> it times out, it will be considered for replication again, and at that
> point the regular replication flow will find an actual target to replicate
> to.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
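
The difference between the current behavior and the proposal can be sketched with a toy model. This is only an illustration, not the attached patch and not the real BlockManager code: the class StripedPendingSketch, its fields needed and pending, and the method scan are hypothetical stand-ins for neededReplications, pendingReplications, and one scheduling pass, and DataNodes are represented by plain strings.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model only; none of these names are the real BlockManager API. */
class StripedPendingSketch {
  /** Stand-in for neededReplications: block IDs waiting to be scheduled. */
  final Set<String> needed = new LinkedHashSet<>();
  /** Stand-in for pendingReplications: block ID -> targets with work in flight. */
  final Map<String, List<String>> pending = new HashMap<>();

  /**
   * One scheduling pass over the needed queue.
   * Returns how many blocks were picked although they already had pending work.
   */
  int scan(boolean moveToPending, String srcNode) {
    int alreadyPending = 0;
    // Iterate over a copy so that entries can be removed from 'needed'.
    for (String block : new ArrayList<>(needed)) {
      List<String> targets = pending.get(block);
      if (targets == null || targets.isEmpty()) {
        continue; // normal scheduling path, not modelled here
      }
      alreadyPending++;
      if (moveToPending) {
        // Proposal: record srcNode as a placeholder target and drop the
        // block from 'needed' so the next pass does not pick it again.
        targets.add(srcNode);
        needed.remove(block);
      }
      // Otherwise: current behavior ("return null"), block stays in 'needed'.
    }
    return alreadyPending;
  }
}
```

With moveToPending set to false, every call to scan picks the same block again, which is the repeated work under the lock described in the issue. With it set to true, the block leaves needed after the first pass. The timeout that would later move the block back into neededReplications is assumed and not modelled here.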