[ 
https://issues.apache.org/jira/browse/HDFS-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12865828#action_12865828
 ] 

bc Wong commented on HDFS-1001:
-------------------------------

Note that this change does not increase client read latency. The existing 
code already sends CHECKSUM_OK even for partial block reads. The fix 
codifies this behaviour and makes sure that the DN side *expects* this 
CHECKSUM_OK.

Another way to fix this bug is to make the client *not* send CHECKSUM_OK for 
partial block reads. However, that complicates the protocol and makes it 
less clean. The cost of sending CHECKSUM_OK is small: it is one extra 
syscall on the client, and it adds no network latency, since the client 
doesn't wait for a response.
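
To illustrate why both sides must agree once connections are reused, here is a 
minimal sketch of the exchange. It is not the patch itself. The class and 
method names are hypothetical, the numeric values for CHECKSUM_OK and the 
next op code are placeholders, and an in-memory buffer stands in for the 
socket.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class ChecksumOkSketch {
    // Placeholder status value for illustration; not taken from the Hadoop source.
    static final short CHECKSUM_OK = 6;

    // Client side (hypothetical helper): after any read, full or partial,
    // write the status and return without waiting for a reply.
    static void clientFinishRead(DataOutputStream toDataNode) throws IOException {
        toDataNode.writeShort(CHECKSUM_OK);
        toDataNode.flush();
    }

    // DataNode side (hypothetical helper): always consume the trailing status
    // so the stream is aligned on the next op if the connection is reused.
    static boolean dataNodeExpectStatus(DataInputStream fromClient) throws IOException {
        return fromClient.readShort() == CHECKSUM_OK;
    }

    public static void main(String[] args) throws IOException {
        // An in-memory buffer stands in for the client-to-DN socket.
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(wire);

        clientFinishRead(out); // end of the first read
        out.writeByte(81);     // placeholder op code for a second request

        DataInputStream in =
            new DataInputStream(new ByteArrayInputStream(wire.toByteArray()));
        boolean ok = dataNodeExpectStatus(in);
        int nextOp = in.readByte();
        System.out.println("status consumed: " + ok + ", next op: " + nextOp);
    }
}
```

If the DN side skipped dataNodeExpectStatus on a cached connection, the two 
status bytes would be misread as the start of the next op, which is the 
disagreement this issue is about.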

> DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK
> ---------------------------------------------------------------------
>
>                 Key: HDFS-1001
>                 URL: https://issues.apache.org/jira/browse/HDFS-1001
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 0.22.0
>            Reporter: bc Wong
>            Assignee: bc Wong
>         Attachments: HDFS-1001-2.patch, HDFS-1001-3.patch, HDFS-1001-3.patch, 
> HDFS-1001-4.patch, HDFS-1001-rebased.patch, HDFS-1001.patch, HDFS-1001.patch.1
>
>
> Running the TestPread with additional debug statements reveals that the 
> BlockReader sends CHECKSUM_OK when the DataXceiver doesn't expect it. 
> Currently it doesn't matter since DataXceiver closes the connection after 
> each op, and CHECKSUM_OK is the last thing on the wire. But if we want to 
> cache connections, they need to agree on the exchange of CHECKSUM_OK.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
