[ 
https://issues.apache.org/jira/browse/HDFS-1227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thanh Do updated HDFS-1227:
---------------------------

    Description: 
- Summary: client append is not atomic; hence, when the client retries during
an append, updateBlock can throw an exception indicating an unmatched file
length, causing the append to fail.
 
- Setup:
+ # available datanodes = 3
+ # disks / datanode = 1
+ # failures = 1
+ failure type = bad disk
+ When/where failure happens = (see below)
+ This bug is non-deterministic; to reproduce it, add a sufficient sleep before
out.write() in BlockReceiver.receivePacket() in dn1 and dn2, but not in dn3
(as sketched below)
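
A rough sketch of that injection (illustrative only: the variable names and the
sleep length below are stand-ins, not the exact 0.20-append receivePacket() code):

    // Fault-injection sketch for BlockReceiver.receivePacket() on dn1 and dn2 only.
    // Delaying the write keeps the packet data off disk while dn1 runs recoverBlock(),
    // so getMetaDataInfo() still reports the old 16-byte block length.
    try {
      Thread.sleep(10000);             // "sufficient sleep" before the packet hits disk
    } catch (InterruptedException ie) {
      Thread.currentThread().interrupt();
    }
    out.write(pktBuf, dataOff, len);   // the delayed write that later grows the block file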
 
- Details:
 Suppose the client appends 16 bytes to block X, which has length 16 bytes at
dn1, dn2, and dn3.
dn1 is the primary. The pipeline is dn3-dn2-dn1. recoverBlock succeeds.
The client starts sending data to dn3, the first datanode in the pipeline.
dn3 forwards the packet to the downstream datanodes and starts writing the
data to its own disk. Suppose an exception occurs in dn3 while writing to disk.
The client gets the exception and starts the recovery code by calling
dn1.recoverBlock() again.
dn1 in turn calls dn2.getMetaDataInfo() and dn1.getMetaDataInfo() to build the
syncList.
Suppose that when getMetaDataInfo() is called at both datanodes (dn1 and dn2),
the previous packet (sent from dn3) has not yet reached disk.
Hence, the block info returned by getMetaDataInfo() reports a length of 16 bytes.
After that, however, the packet reaches disk, and the block file length becomes
32 bytes.
Using the syncList (which contains block info with a length of 16 bytes), dn1
calls updateBlock at dn2 and dn1. This fails because the length in the block
info passed to updateBlock (16 bytes) does not match the actual length of the
block file on disk (32 bytes).
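
The failure boils down to the length comparison in the following toy simulation
(self-contained Java; all names are invented here to mirror the sequence above,
it is not HDFS code):

    import java.util.concurrent.CountDownLatch;

    // Toy simulation of the race described above (not HDFS code; all names are invented).
    public class UpdateBlockRaceDemo {
      static volatile long blockFileLength = 16;        // block X already holds 16 bytes on disk
      static final CountDownLatch metaRead = new CountDownLatch(1);

      public static void main(String[] args) throws Exception {
        // Delayed packet write on dn1/dn2: the 16 appended bytes reach disk only
        // after recoverBlock() has already sampled the metadata.
        Thread delayedWrite = new Thread(() -> {
          try { metaRead.await(); } catch (InterruptedException ignored) { }
          blockFileLength += 16;                        // packet finally reaches disk -> 32 bytes
        });
        delayedWrite.start();

        // recoverBlock(): build the syncList from getMetaDataInfo(), which still sees 16 bytes.
        long syncListLength = blockFileLength;          // 16
        metaRead.countDown();
        delayedWrite.join();

        // updateBlock(): the length carried in the new block info must match the on-disk length.
        if (syncListLength != blockFileLength) {        // 16 != 32 -> the reported failure
          throw new java.io.IOException("updateBlock failed: expected length " + syncListLength
              + " but block file on disk has length " + blockFileLength);
        }
      }
    }

Running it throws an IOException with the same 16-vs-32 mismatch described above.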
 
Note that this bug is non-deterministic; it depends on the thread interleaving
at the datanodes.

This bug was found by our Failure Testing Service framework:
http://www.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-98.html
For questions, please email us: Thanh Do (than...@cs.wisc.edu) and 
Haryadi Gunawi (hary...@eecs.berkeley.edu)



  was: (previous description, identical to the above except "# failures = 2")

> UpdateBlock fails due to unmatched file length
> ----------------------------------------------
>
>                 Key: HDFS-1227
>                 URL: https://issues.apache.org/jira/browse/HDFS-1227
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node
>    Affects Versions: 0.20-append
>            Reporter: Thanh Do
>
> (description as updated above)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
