[ 
https://issues.apache.org/jira/browse/HDFS-6804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081972#comment-14081972
 ] 

Gordon Wang commented on HDFS-6804:
-----------------------------------

Some thoughts on how to fix this issue. In my mind, there are two ways to fix 
it.
* Option1
When the block is opened for appending, check whether any DataTransfer 
threads are transferring the block to other DNs, and stop them (a rough 
sketch of the bookkeeping this needs is shown after Option2 below).
We can stop these threads safely because opening the block for append bumps 
its generation stamp, so the DataTransfer threads are sending an outdated 
replica. 

* Option2
In the DataTransfer thread, if the replica of the block is finalized, read 
the checksum of the last data chunk into memory and record the replica 
length in memory as well. Then, when sending the last data chunk, use the 
in-memory checksum instead of reading it from disk (a sketch of this is 
shown after the discussion below).
This is similar to how an RBW replica is already handled in DataTransfer.
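
Below is a minimal sketch (plain Java, not existing DataNode code) of the 
bookkeeping Option1 would require. All names here (TransferRegistry, 
stopTransfers, etc.) are hypothetical; the point is only that the DataNode 
has to track in-flight DataTransfer threads per block so that an append can 
interrupt the ones that are now sending an outdated replica.
{noformat}
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical registry of in-flight DataTransfer threads, keyed by block id. */
class TransferRegistry {
  private final Map<Long, Set<Thread>> transfers = new ConcurrentHashMap<>();

  /** Called by a DataTransfer thread before it starts sending the replica. */
  void register(long blockId, Thread t) {
    transfers.computeIfAbsent(blockId, k -> ConcurrentHashMap.newKeySet()).add(t);
  }

  /** Called by a DataTransfer thread when it finishes or aborts. */
  void unregister(long blockId, Thread t) {
    Set<Thread> set = transfers.get(blockId);
    if (set != null) {
      set.remove(t);
    }
  }

  /**
   * Called when the block is re-opened for append. The registered threads are
   * sending a replica with an outdated generation stamp, so interrupt them;
   * the DataTransfer loop would have to check the interrupt flag and abort.
   */
  void stopTransfers(long blockId) {
    Set<Thread> set = transfers.remove(blockId);
    if (set != null) {
      for (Thread t : set) {
        t.interrupt();
      }
    }
  }
}
{noformat}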

For Option1, it is hard to stop a DataTransfer thread unless we add code to 
the DataNode to track and manage DataTransfer threads.
For Option2, we must hold the lock on the FsDatasetImpl object in the 
DataNode while reading the last chunk checksum from disk; otherwise the last 
chunk might be overwritten by a concurrent append. But disk reads take time, 
and putting expensive disk IO operations inside the FsDatasetImpl lock might 
degrade DataNode performance.
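
Here is a similar sketch of the Option2 idea, again with hypothetical names 
(ReplicaSnapshot, LastChunkChecksumReader) and an assumed meta file layout: 
while the caller holds the dataset lock, snapshot the finalized replica's 
length and the checksum of its last (possibly partial) data chunk, then 
release the lock and let the transfer use the cached values so a concurrent 
append cannot invalidate what is being sent.
{noformat}
import java.io.IOException;
import java.io.RandomAccessFile;

/** Length and last-chunk checksum captured while holding the dataset lock. */
class ReplicaSnapshot {
  final long length;              // replica length at snapshot time
  final byte[] lastChunkChecksum; // checksum of the last partial chunk, or null

  ReplicaSnapshot(long length, byte[] lastChunkChecksum) {
    this.length = length;
    this.lastChunkChecksum = lastChunkChecksum;
  }
}

class LastChunkChecksumReader {
  // Assumed sizes; the real meta file format and checksum type may differ.
  private static final int META_HEADER_LEN = 7; // version + checksum header
  private final int bytesPerChecksum;           // e.g. 512
  private final int checksumSize;               // e.g. 4 for CRC32

  LastChunkChecksumReader(int bytesPerChecksum, int checksumSize) {
    this.bytesPerChecksum = bytesPerChecksum;
    this.checksumSize = checksumSize;
  }

  /**
   * Must be called while the caller holds the dataset lock, so that an append
   * cannot rewrite the last chunk between reading the block length and
   * reading the checksum that covers it.
   */
  ReplicaSnapshot snapshot(RandomAccessFile blockFile, RandomAccessFile metaFile)
      throws IOException {
    long blockLen = blockFile.length();
    if (blockLen % bytesPerChecksum == 0) {
      return new ReplicaSnapshot(blockLen, null); // no partial chunk to cache
    }
    // The checksum covering the last (partial) chunk follows the checksums of
    // all full chunks in the meta file.
    long numFullChunks = blockLen / bytesPerChecksum;
    long csumOffset = META_HEADER_LEN + numFullChunks * checksumSize;
    byte[] csum = new byte[checksumSize];
    metaFile.seek(csumOffset);
    metaFile.readFully(csum);
    return new ReplicaSnapshot(blockLen, csum);
  }
}
{noformat}
The only work done under the lock is a seek and a read of a few bytes, which 
keeps the lock hold time short, but it is still disk IO inside the 
FsDatasetImpl lock, which is exactly the performance concern mentioned above.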

Any opinions or comments are welcome!
Thanks.

> race condition between transferring block and appending block causes 
> "Unexpected checksum mismatch exception" 
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-6804
>                 URL: https://issues.apache.org/jira/browse/HDFS-6804
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>    Affects Versions: 2.2.0
>            Reporter: Gordon Wang
>
> We found some error logs in the datanode, like this:
> {noformat}
> 2014-07-22 01:49:51,338 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exception for BP-2072804351-192.168.2.104-1406008383435:blk_1073741997_9248
> java.io.IOException: Terminating due to a checksum error.java.io.IOException: Unexpected checksum mismatch while writing BP-2072804351-192.168.2.104-1406008383435:blk_1073741997_9248 from /192.168.2.101:39495
>         at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:536)
>         at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:703)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:575)
>         at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:115)
>         at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:68)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:221)
>         at java.lang.Thread.run(Thread.java:744)
> {noformat}
> Meanwhile, the log on the source datanode says the block was transmitted:
> {noformat}
> 2014-07-22 01:49:50,805 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DataTransfer: Transmitted BP-2072804351-192.168.2.104-1406008383435:blk_1073741997_9248 (numBytes=16188152) to /192.168.2.103:50010
> {noformat}
> When the destination datanode gets the checksum mismatch, it reports a bad 
> block to the NameNode, and the NameNode marks the replica on the source 
> datanode as corrupt. But the replica on the source datanode is actually 
> valid, because it can pass checksum verification.
> In short, the replica on the source datanode is wrongly marked as corrupt.



