[ https://issues.apache.org/jira/browse/HDFS-9040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14791739#comment-14791739 ]
Walter Su commented on HDFS-9040:
---------------------------------

bq. should we just do updatePipeline when completing the block?

1. In the read-being-written scenario, there will be a longer window of *false-fresh* (meaning a stale internal block is considered fresh). We should do it before hflush/hsync as well.

bq. 2. When NUM_PARITY_BLOCKS number of streamers are dead, the OutputStream should die immediately instead of waiting for the next writeChunk.

A failed streamer is detected in writeChunk. We plan to add periodic checking. [~jingzhao] said that before.

bq. 3. We might want to add the logic to replace a failed StripedDataStreamer in the future.

No, we won't, I think, if you're talking about something like Datanode replacement for a replicated block. There you can transfer a healthy replicated RBW to a new Datanode, and you still have 3 DNs after replacement. But recovering a corrupted RBW internal block is difficult.

I have a question. Instead of delaying it, do we even need to refresh UC.replicas?

1. A client reading a UC block that is being written can decode a replica if it misses some part. (With checksum verification, we are only concerned about 'missing'.)
2. Block recovery / lease recovery truncates all RBWs to the minimal length for a replicated block. For striping, assume a corrupted internal block has a small length, like 200 KB, and 8 healthy internal blocks have long lengths, in the range (1 MB - cellSize, 1 MB + cellSize). Of course, after recovery we should truncate the 8 to 1 MB (the 8 healthy internal blocks should be at the same last stripe, but should we truncate the last stripe? That's not my point). My point is that we can rule out the corrupted internal blocks in {{commitBlockSynchronization}}.
3. Maintaining the indices of UC.replicas. Updating UC.replicas from a BlockReport is safe, because the reportedBlock has an ID. If UC.replicas is updated by updatePipeline, the indices are derived from the array offset; see {{UC.setExpectedLocations()}}. That is error prone. If we don't refresh UC.replicas, we are pretty safe.
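To make point 3 concrete, here is a minimal sketch of why offset-derived indices can go wrong once a streamer has failed. It is not the actual HDFS code: the class {{ReplicaIndexSketch}}, all of its methods, the group ID, and the assumption that the low 4 bits of an internal block ID encode its index in the block group are hypothetical and used only for illustration.

```java
import java.util.Arrays;

// Illustrative sketch only; none of these names are real HDFS classes or methods.
class ReplicaIndexSketch {
    // Assumption: the low 4 bits of an internal block ID hold its index in the block group.
    static final long INDEX_MASK = 0xFL;

    // Index recovered from the ID itself, as is possible for a reported block.
    static int indexFromBlockId(long internalBlockId) {
        return (int) (internalBlockId & INDEX_MASK);
    }

    // ID-derived indices: each entry names its own position in the group.
    static int[] indicesFromIds(long[] reportedIds) {
        int[] indices = new int[reportedIds.length];
        for (int i = 0; i < reportedIds.length; i++) {
            indices[i] = indexFromBlockId(reportedIds[i]);
        }
        return indices;
    }

    // Offset-derived indices: entry i is simply assumed to be internal block i.
    static int[] indicesFromOffsets(int numLocations) {
        int[] indices = new int[numLocations];
        for (int i = 0; i < numLocations; i++) {
            indices[i] = i;
        }
        return indices;
    }

    // IDs of the internal blocks that remain after one streamer fails.
    static long[] survivingIds(long groupId, int groupSize, int failedIndex) {
        long[] ids = new long[groupSize - 1];
        int next = 0;
        for (int i = 0; i < groupSize; i++) {
            if (i != failedIndex) {
                ids[next++] = groupId + i;
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        long groupId = -1024L; // hypothetical group ID whose low 4 bits are zero
        long[] survivors = survivingIds(groupId, 9, 2); // streamer 2 has failed
        System.out.println("by ID:     " + Arrays.toString(indicesFromIds(survivors)));
        System.out.println("by offset: " + Arrays.toString(indicesFromOffsets(survivors.length)));
    }
}
```

In this sketch, once the failed streamer's entry is dropped from the array, every later entry gets an offset-derived index that is one less than its ID-derived index, which is the kind of mismatch that refreshing UC.replicas from updatePipeline would have to guard against.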
> Erasure coding: Refactor DFSStripedOutputStream (Move Namenode RPC Requests to Coordinator)
> -------------------------------------------------------------------------------------------
>
>                 Key: HDFS-9040
>                 URL: https://issues.apache.org/jira/browse/HDFS-9040
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Walter Su
>         Attachments: HDFS-9040-HDFS-7285.002.patch, HDFS-9040-HDFS-7285.003.patch, HDFS-9040.00.patch, HDFS-9040.001.wip.patch, HDFS-9040.02.bgstreamer.patch
>
>
> The general idea is to simplify error handling logic.
> Proposal 1:
> A BlockGroupDataStreamer to communicate with NN to allocate/update block, and StripedDataStreamers only have to stream blocks to DNs.
> Proposal 2:
> See below the [comment|https://issues.apache.org/jira/browse/HDFS-9040?focusedCommentId=14741388&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14741388] from [~jingzhao].

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)