[ https://issues.apache.org/jira/browse/HDFS-679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12764913#action_12764913 ]
Hairong Kuang commented on HDFS-679:
------------------------------------

The proposal in the above comment is a simple solution, but it has an unresolvable flaw: the client needs READ permission to read the partial chunk, which a writer is not required to have. So instead of switching to a different solution, I will focus on making the existing solution work. Here is the plan:

1. The client does not set appendChunk to false until the first chunk has been sent.
2. If the datanode receives a request to append "bcd" to a partial chunk "a", but "bc" has already been written to disk by a previous hflush, the datanode will read "abc" from disk, compute the checksum of "abcd", and then write "d" and the new checksum to disk. In the current trunk and 0.21, the datanode mistakenly computes the crc of "abcbcd".

I will also implement the proposed optimization: the block receiver at the datanode does not overwrite the block file if a packet starts with data that the replica file already has. For the crc file, only the last 4 bytes are allowed to be overwritten. This optimization makes the datanode-side fix easy to implement.

> Block receiver unexpectedly throws an IOException complaining mismatched checksum
> ---------------------------------------------------------------------------------
>
>                 Key: HDFS-679
>                 URL: https://issues.apache.org/jira/browse/HDFS-679
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 0.21.0
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>             Fix For: 0.21.0
>
> When I run TestClientProtocolForPipelineRecovery, I always see that the block
> receiver throws an IOException complaining about a mismatched checksum when
> receiving the last data packet. It turns out the checksum of the last packet was
> unexpectedly set to zero.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
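The merge-and-recompute logic in step 2 of the plan can be sketched as follows. This is a hypothetical illustration, not the actual Hadoop BlockReceiver code: `receive_packet`, its arguments, and the use of plain `zlib.crc32` are all assumptions made for the example; it only shows how the checksum must cover the merged chunk contents ("abcd") rather than the naive concatenation ("abcbcd").

```python
import zlib

def receive_packet(on_disk: bytes, packet: bytes, chunk_offset: int):
    """Return (bytes_to_append, crc_of_whole_chunk).

    on_disk      -- chunk bytes already in the replica file (e.g. b"abc")
    packet       -- data carried by the incoming packet (e.g. b"bcd")
    chunk_offset -- offset within the chunk where the packet's data starts
    """
    # How many of the packet's bytes the replica already has on disk.
    already_have = len(on_disk) - chunk_offset
    # Only the genuinely new tail of the packet gets written to the block file.
    new_tail = packet[already_have:]
    # Full chunk contents after the append: disk bytes plus the new tail.
    merged = on_disk + new_tail
    # Checksum is computed over the merged chunk ("abcd"), not "abcbcd".
    checksum = zlib.crc32(merged)
    return new_tail, checksum

# Example from the comment: packet "bcd" appended at offset 1, with "bc"
# already on disk from a previous hflush.
tail, crc = receive_packet(b"abc", b"bcd", 1)
assert tail == b"d"                    # only "d" is written to the block file
assert crc == zlib.crc32(b"abcd")      # checksum covers the merged chunk
```

This also matches the proposed optimization: the block file is never overwritten with data it already has, and only the trailing checksum (the last 4 bytes of the crc file) is rewritten.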