[ https://issues.apache.org/jira/browse/HDFS-9173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14949985#comment-14949985 ]
Walter Su commented on HDFS-9173:
---------------------------------

{noformat}
blk_0  blk_1  blk_2  blk_3  blk_4  blk_5  blk_6  blk_7  blk_8
 64k    64k    64k    64k    64k    64k    64k    64k    64k   <-- stripe_0
 64k    64k    64k    64k    64k    64k    64k    64k    64k
 64k    64k    64k    64k    64k    64k    64k    61k          <-- startStripeIdx
 64k    64k    64k    64k    64k    64k    64k
 64k    64k    64k    64k    64k    64k    59k
 64k    64k    64k    64k    64k    64k
 64k    64k    64k    64k    64k    64k                        <-- last full stripe
 64k    64k    13k    64k    55k     3k                        <-- target last stripe
 64k    64k           64k     1k
 64k    64k           58k
 64k    64k
 64k    19k
 64k                                                           <-- total visible stripe

Because the streamers run at different speeds, the internal blocks in a block
group can have different lengths when the block group isn't ended normally.
The purpose of this class is to recover the UnderConstruction block group so
that all internal blocks end at the same stripe.

The steps:
1. Get the lengths of all blocks from the DataNodes.
2. Calculate the safe length, which is at the target last stripe.
3. Decode and feed blk_6~8 so that they end at the last full stripe. (The last
   full stripe is the last decodable stripe.)
4. Encode the target last stripe with the remaining sequential data. In this
   case, the sequential data is 64k+64k+13k. Feed blk_6~8 the parity cells.
   Overwrite the parity cell if necessary.
5. Truncate the stripes from the total visible stripe to the target last stripe.
{noformat}

Step #4 requires overwriting parity blocks (if some parity block is the longest block), which is not supported by DataTransferProtocol.writeBlock(..). So in the 01 patch the safe length ends at the "last full stripe" rather than the "target last stripe". We will probably extend the protocol as suggested by [~jingzhao] ([link|https://issues.apache.org/jira/browse/HDFS-7663?focusedCommentId=14934206&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14934206]). Step #4 can be done later, after hflush is implemented.
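To make steps 2 and 4 concrete, here is a minimal Java sketch of how the "last full stripe" and the sequential data in the "target last stripe" could be derived from the reported block lengths. It is an illustration only, not the code in the patch: the class `SafeLengthSketch` and its method names are hypothetical, and it assumes that the first `dataBlkNum` entries of `blockLens` are the data blocks in order (blk_0 to blk_5 in the diagram) and that "sequential data" means the contiguous run of data cells starting at blk_0.

```java
import java.util.Arrays;

/** Hypothetical helper, not part of HDFS; illustrates the length arithmetic only. */
final class SafeLengthSketch {
  private SafeLengthSketch() {}

  /**
   * Number of stripes in which at least dataBlkNum internal blocks hold a
   * full cell, i.e. the stripes that can still be decoded.
   */
  static long fullStripeCount(long[] blockLens, int dataBlkNum, long cellSize) {
    if (blockLens.length < dataBlkNum) {
      throw new IllegalArgumentException("need at least dataBlkNum block lengths");
    }
    long[] sorted = blockLens.clone();
    Arrays.sort(sorted); // ascending order
    // The dataBlkNum-th longest block bounds the decodable region.
    long bound = sorted[sorted.length - dataBlkNum];
    return bound / cellSize;
  }

  /** Bytes of user data kept if recovery stops at the last full stripe. */
  static long safeLengthAtLastFullStripe(long[] blockLens, int dataBlkNum, long cellSize) {
    return fullStripeCount(blockLens, dataBlkNum, cellSize) * dataBlkNum * cellSize;
  }

  /**
   * Sequential data bytes in the stripe right after the last full stripe
   * (assumed here to be the "target last stripe"). Counting stops after the
   * first data cell that is not full, since later cells are not contiguous.
   */
  static long sequentialBytesInTargetStripe(long[] blockLens, int dataBlkNum, long cellSize) {
    long stripeStart = fullStripeCount(blockLens, dataBlkNum, cellSize) * cellSize;
    long total = 0;
    for (int i = 0; i < dataBlkNum; i++) {
      long inCell = Math.min(cellSize, Math.max(0L, blockLens[i] - stripeStart));
      total += inCell;
      if (inCell < cellSize) {
        break;
      }
    }
    return total;
  }
}
```

With 6 data blocks, 64k cells, and lengths read off the diagram (for example blk_2 = 7 x 64k + 13k and blk_5 = 7 x 64k + 3k), this sketch yields 7 full stripes and 64k + 64k + 13k of sequential data in the next stripe, which agrees with the figures quoted in step 4.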
> Erasure Coding: Lease recovery for striped file
> -----------------------------------------------
>
>                 Key: HDFS-9173
>                 URL: https://issues.apache.org/jira/browse/HDFS-9173
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Walter Su
>            Assignee: Walter Su
>         Attachments: HDFS-9173.00.wip.patch
>
>

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)