[ https://issues.apache.org/jira/browse/HDFS-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12865828#action_12865828 ]
bc Wong commented on HDFS-1001:
-------------------------------

Note that this change does not increase the client read latency. The existing code is already sending CHECKSUM_OK even for partial block read. The fix codifies this behaviour and makes sure that the DN side *expects* this CHECKSUM_OK.

Another way to fix this bug is to make the client *not* send CHECKSUM_OK for partial block reads. However, that complicates the protocol and makes it unclean. The cost of sending the CHECKSUM_OK is small enough. It is one extra syscall. But it's not network latency since the client doesn't need a response.

> DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK
> ---------------------------------------------------------------------
>
> Key: HDFS-1001
> URL: https://issues.apache.org/jira/browse/HDFS-1001
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 0.22.0
> Reporter: bc Wong
> Assignee: bc Wong
> Attachments: HDFS-1001-2.patch, HDFS-1001-3.patch, HDFS-1001-3.patch,
> HDFS-1001-4.patch, HDFS-1001-rebased.patch, HDFS-1001.patch, HDFS-1001.patch.1
>
>
> Running the TestPread with additional debug statements reveals that the
> BlockReader sends CHECKSUM_OK when the DataXceiver doesn't expect it.
> Currently it doesn't matter since DataXceiver closes the connection after
> each op, and CHECKSUM_OK is the last thing on the wire. But if we want to
> cache connections, they need to agree on the exchange of CHECKSUM_OK.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
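
To illustrate why the two sides have to agree once connections are cached, here is a minimal Java sketch. It is not the HDFS code and not the attached patch. The class `ChecksumOkSketch`, the methods `clientBytes` and `serve`, and the numeric values used for the op code and for CHECKSUM_OK are all assumptions made up for this example. It models only the client-to-DN direction of one reused connection: the client writes a read request followed by CHECKSUM_OK for every read, and the DN either consumes that status or does not.

```java
import java.nio.ByteBuffer;

final class ChecksumOkSketch {
    // Hypothetical wire codes, chosen for this sketch only.
    static final short OP_READ_BLOCK = 81;
    static final short CHECKSUM_OK = 6;

    /** Bytes the client puts on one connection for numReads consecutive reads. */
    static ByteBuffer clientBytes(int numReads) {
        ByteBuffer wire = ByteBuffer.allocate(numReads * 2 * Short.BYTES);
        for (int i = 0; i < numReads; i++) {
            wire.putShort(OP_READ_BLOCK); // the read request
            wire.putShort(CHECKSUM_OK);   // sent after every read, full or partial
        }
        wire.flip(); // switch the buffer from writing to reading
        return wire;
    }

    /** DN side of a reused connection; returns how many ops were served. */
    static int serve(ByteBuffer wire, boolean expectChecksumOk) {
        int served = 0;
        while (wire.hasRemaining()) {
            short op = wire.getShort();
            if (op != OP_READ_BLOCK) {
                throw new IllegalStateException(
                    "unexpected op code " + op + " after " + served + " op(s)");
            }
            // ... the block data would be streamed back to the client here ...
            if (expectChecksumOk) {
                short status = wire.getShort(); // consume the client's status
                if (status != CHECKSUM_OK) {
                    throw new IllegalStateException("expected CHECKSUM_OK, got " + status);
                }
            }
            served++;
        }
        return served;
    }

    public static void main(String[] args) {
        System.out.println("DN expecting CHECKSUM_OK served "
            + serve(clientBytes(2), true) + " ops");
        try {
            serve(clientBytes(2), false);
        } catch (IllegalStateException e) {
            System.out.println("DN not expecting CHECKSUM_OK failed: " + e.getMessage());
        }
    }
}
```

In this sketch a DN that does not expect the status misreads the leftover CHECKSUM_OK as the next op code as soon as the connection is reused. With a connection that is closed after each op, the stray bytes are simply discarded, which matches the observation in the issue description that the mismatch currently doesn't matter.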