[jira] Commented: (HDFS-1001) DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK

2010-11-19 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12933735#action_12933735
 ] 

Eli Collins commented on HDFS-1001:
---

Unit test results look the same as trunk before the latest patch.
 
{noformat}
 [exec] 
 [exec] +1 overall.  
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 9 new or 
modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
(version 1.3.9) warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.
 [exec] 
 [exec] +1 system test framework.  The patch passed system test 
framework compile.
 [exec] 
{noformat}

> DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK
> -
>
> Key: HDFS-1001
> URL: https://issues.apache.org/jira/browse/HDFS-1001
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node
>Reporter: bc Wong
>Assignee: bc Wong
>Priority: Minor
> Fix For: 0.23.0
>
> Attachments: HDFS-1001-2.patch, HDFS-1001-3.patch, HDFS-1001-3.patch, 
> HDFS-1001-4.patch, HDFS-1001-5.patch, HDFS-1001-6.patch, 
> HDFS-1001-rebased.patch, HDFS-1001.patch, HDFS-1001.patch.1
>
>
> Running the TestPread with additional debug statements reveals that the 
> BlockReader sends CHECKSUM_OK when the DataXceiver doesn't expect it. 
> Currently it doesn't matter since DataXceiver closes the connection after 
> each op, and CHECKSUM_OK is the last thing on the wire. But if we want to 
> cache connections, they need to agree on the exchange of CHECKSUM_OK.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1001) DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK

2010-05-10 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12865936#action_12865936
 ] 

Eli Collins commented on HDFS-1001:
---

bq. Another way to fix this bug is to make the client not send CHECKSUM_OK for 
partial block reads. However, that complicates the protocol and makes it 
unclean. 

Agree the way you currently have it is better.  Latest patch looks good to me.

> DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK
> -
>
> Key: HDFS-1001
> URL: https://issues.apache.org/jira/browse/HDFS-1001
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.22.0
>Reporter: bc Wong
>Assignee: bc Wong
> Attachments: HDFS-1001-2.patch, HDFS-1001-3.patch, HDFS-1001-3.patch, 
> HDFS-1001-4.patch, HDFS-1001-rebased.patch, HDFS-1001.patch, HDFS-1001.patch.1
>
>
> Running the TestPread with additional debug statements reveals that the 
> BlockReader sends CHECKSUM_OK when the DataXceiver doesn't expect it. 
> Currently it doesn't matter since DataXceiver closes the connection after 
> each op, and CHECKSUM_OK is the last thing on the wire. But if we want to 
> cache connections, they need to agree on the exchange of CHECKSUM_OK.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1001) DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK

2010-05-10 Thread bc Wong (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12865828#action_12865828
 ] 

bc Wong commented on HDFS-1001:
---

Note that this change does not increase the client read latency. The existing 
code is already sending CHECKSUM_OK even for partial block read. The fix 
codifies this behaviour and makes sure that the DN side *expects* this 
CHECKSUM_OK.

Another way to fix this bug is to make the client *not* send CHECKSUM_OK for 
partial block reads. However, that complicates the protocol and makes it 
unclean. The cost of sending the CHECKSUM_OK is small enough. It is one extra 
syscall. But it's not network latency since the client doesn't need a response.

> DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK
> -
>
> Key: HDFS-1001
> URL: https://issues.apache.org/jira/browse/HDFS-1001
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.22.0
>Reporter: bc Wong
>Assignee: bc Wong
> Attachments: HDFS-1001-2.patch, HDFS-1001-3.patch, HDFS-1001-3.patch, 
> HDFS-1001-4.patch, HDFS-1001-rebased.patch, HDFS-1001.patch, HDFS-1001.patch.1
>
>
> Running the TestPread with additional debug statements reveals that the 
> BlockReader sends CHECKSUM_OK when the DataXceiver doesn't expect it. 
> Currently it doesn't matter since DataXceiver closes the connection after 
> each op, and CHECKSUM_OK is the last thing on the wire. But if we want to 
> cache connections, they need to agree on the exchange of CHECKSUM_OK.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1001) DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK

2010-04-29 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862370#action_12862370
 ] 

Eli Collins commented on HDFS-1001:
---

+1 Nice change. 

Nits:
* in DataNode  "DataNode always expects" should read "always checks" since the 
response is optional. 
* Would rename readCasually to something like readBytesCheckEOS
* In DataXceiver "from client" should read "from the client" 



> DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK
> -
>
> Key: HDFS-1001
> URL: https://issues.apache.org/jira/browse/HDFS-1001
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.22.0
>Reporter: bc Wong
>Assignee: bc Wong
> Attachments: HDFS-1001-2.patch, HDFS-1001-3.patch, HDFS-1001-3.patch, 
> HDFS-1001-rebased.patch, HDFS-1001.patch, HDFS-1001.patch.1
>
>
> Running the TestPread with additional debug statements reveals that the 
> BlockReader sends CHECKSUM_OK when the DataXceiver doesn't expect it. 
> Currently it doesn't matter since DataXceiver closes the connection after 
> each op, and CHECKSUM_OK is the last thing on the wire. But if we want to 
> cache connections, they need to agree on the exchange of CHECKSUM_OK.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1001) DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK

2010-03-18 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12847052#action_12847052
 ] 

Todd Lipcon commented on HDFS-1001:
---

The body of patch looks good to me. But could we merge 
TestClientBlockVerification and the new TestDataXceiver? I recall you made the 
new test so you could be in the server.datanode package, but could the cases of 
TestClientBlockVerification move in here too? If not, maybe we can at least 
share a bit of the test code (most of the code except for the test case itself 
is duplicated)

> DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK
> -
>
> Key: HDFS-1001
> URL: https://issues.apache.org/jira/browse/HDFS-1001
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.22.0
>Reporter: bc Wong
>Assignee: bc Wong
> Attachments: HDFS-1001-rebased.patch, HDFS-1001.patch, 
> HDFS-1001.patch.1
>
>
> Running the TestPread with additional debug statements reveals that the 
> BlockReader sends CHECKSUM_OK when the DataXceiver doesn't expect it. 
> Currently it doesn't matter since DataXceiver closes the connection after 
> each op, and CHECKSUM_OK is the last thing on the wire. But if we want to 
> cache connections, they need to agree on the exchange of CHECKSUM_OK.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1001) DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK

2010-03-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12846574#action_12846574
 ] 

Hadoop QA commented on HDFS-1001:
-

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12438935/HDFS-1001-rebased.patch
  against trunk revision 923467.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/271/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/271/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/271/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/271/console

This message is automatically generated.

> DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK
> -
>
> Key: HDFS-1001
> URL: https://issues.apache.org/jira/browse/HDFS-1001
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.22.0
>Reporter: bc Wong
>Assignee: bc Wong
> Attachments: HDFS-1001-rebased.patch, HDFS-1001.patch, 
> HDFS-1001.patch.1
>
>
> Running the TestPread with additional debug statements reveals that the 
> BlockReader sends CHECKSUM_OK when the DataXceiver doesn't expect it. 
> Currently it doesn't matter since DataXceiver closes the connection after 
> each op, and CHECKSUM_OK is the last thing on the wire. But if we want to 
> cache connections, they need to agree on the exchange of CHECKSUM_OK.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1001) DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK

2010-03-02 Thread bc Wong (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12840342#action_12840342
 ] 

bc Wong commented on HDFS-1001:
---

Test failure (in TestSecurityTokenEditLog) seems to be HDFS-1015.

> DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK
> -
>
> Key: HDFS-1001
> URL: https://issues.apache.org/jira/browse/HDFS-1001
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.22.0
>Reporter: bc Wong
>Assignee: bc Wong
> Attachments: HDFS-1001.patch, HDFS-1001.patch.1
>
>
> Running the TestPread with additional debug statements reveals that the 
> BlockReader sends CHECKSUM_OK when the DataXceiver doesn't expect it. 
> Currently it doesn't matter since DataXceiver closes the connection after 
> each op, and CHECKSUM_OK is the last thing on the wire. But if we want to 
> cache connections, they need to agree on the exchange of CHECKSUM_OK.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1001) DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK

2010-03-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12840331#action_12840331
 ] 

Hadoop QA commented on HDFS-1001:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12437622/HDFS-1001.patch
  against trunk revision 916902.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/258/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/258/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/258/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/258/console

This message is automatically generated.

> DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK
> -
>
> Key: HDFS-1001
> URL: https://issues.apache.org/jira/browse/HDFS-1001
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.22.0
>Reporter: bc Wong
>Assignee: bc Wong
> Attachments: HDFS-1001.patch, HDFS-1001.patch.1
>
>
> Running the TestPread with additional debug statements reveals that the 
> BlockReader sends CHECKSUM_OK when the DataXceiver doesn't expect it. 
> Currently it doesn't matter since DataXceiver closes the connection after 
> each op, and CHECKSUM_OK is the last thing on the wire. But if we want to 
> cache connections, they need to agree on the exchange of CHECKSUM_OK.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1001) DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK

2010-03-02 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12840316#action_12840316
 ] 

Todd Lipcon commented on HDFS-1001:
---

bq. The test fails if I verify the opposite.

But what if you verify what you're trying to test, and then revert the code so 
it breaks? I think the way you're using the spy right now will always think 
there are no interactions.

> DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK
> -
>
> Key: HDFS-1001
> URL: https://issues.apache.org/jira/browse/HDFS-1001
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.22.0
>Reporter: bc Wong
>Assignee: bc Wong
> Attachments: HDFS-1001.patch
>
>
> Running the TestPread with additional debug statements reveals that the 
> BlockReader sends CHECKSUM_OK when the DataXceiver doesn't expect it. 
> Currently it doesn't matter since DataXceiver closes the connection after 
> each op, and CHECKSUM_OK is the last thing on the wire. But if we want to 
> cache connections, they need to agree on the exchange of CHECKSUM_OK.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1001) DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK

2010-03-02 Thread bc Wong (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12840314#action_12840314
 ] 

bc Wong commented on HDFS-1001:
---

> Can you verify that the test fails properly if the change to the code is 
> reverted?

The test fails if I verify the opposite.

{code}
Testcase: testCompletePartialRead took 0.609 sec
Caused an ERROR

Wanted but not invoked:
dataBlockScanner.verifiedByClient(
isA(org.apache.hadoop.hdfs.protocol.Block)
);
-> at 
org.apache.hadoop.hdfs.server.datanode.TestDataXceiver.testCompletePartialRead(TestDataXceiver.java:146)
Actually, there were zero interactions with this mock.

Wanted but not invoked:
dataBlockScanner.verifiedByClient(
isA(org.apache.hadoop.hdfs.protocol.Block)
);
-> at 
org.apache.hadoop.hdfs.server.datanode.TestDataXceiver.testCompletePartialRead(TestDataXceiver.java:146)
Actually, there were zero interactions with this mock.

at 
org.apache.hadoop.hdfs.server.datanode.TestDataXceiver.testCompletePartialRead(TestDataXceiver.java:146)
{code}

According to Mockito documentation, they don't recommend operating on the spy 
object directly. Also see Q1 here:


> DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK
> -
>
> Key: HDFS-1001
> URL: https://issues.apache.org/jira/browse/HDFS-1001
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.22.0
>Reporter: bc Wong
>Assignee: bc Wong
> Attachments: HDFS-1001.patch
>
>
> Running the TestPread with additional debug statements reveals that the 
> BlockReader sends CHECKSUM_OK when the DataXceiver doesn't expect it. 
> Currently it doesn't matter since DataXceiver closes the connection after 
> each op, and CHECKSUM_OK is the last thing on the wire. But if we want to 
> cache connections, they need to agree on the exchange of CHECKSUM_OK.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1001) DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK

2010-03-02 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12840292#action_12840292
 ] 

Todd Lipcon commented on HDFS-1001:
---

Hey bc,
I think there may be a small error in the test. In testCompletePartialRead, you 
spy on the blockScanner, but don't assign the spy back into this.getDataNode() 
- thus the verifiedByClient call won't go through the spy object and the test 
will always pass (at least by my understanding of how mockito works). I think 
you need to do:

{code}
DataBlockScanner scanner = spy(this.getDataNode().blockScanner);
this.getDatanode().blockScanner = scanner;
...
{code}

Can you verify that the test fails properly if the change to the code is 
reverted?


> DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK
> -
>
> Key: HDFS-1001
> URL: https://issues.apache.org/jira/browse/HDFS-1001
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.22.0
>Reporter: bc Wong
>Assignee: bc Wong
> Attachments: HDFS-1001.patch
>
>
> Running the TestPread with additional debug statements reveals that the 
> BlockReader sends CHECKSUM_OK when the DataXceiver doesn't expect it. 
> Currently it doesn't matter since DataXceiver closes the connection after 
> each op, and CHECKSUM_OK is the last thing on the wire. But if we want to 
> cache connections, they need to agree on the exchange of CHECKSUM_OK.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1001) DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK

2010-03-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12840051#action_12840051
 ] 

Hadoop QA commented on HDFS-1001:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org
  against trunk revision 916902.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 patch.  The patch command could not apply the patch.

Console output: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/257/console

This message is automatically generated.

> DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK
> -
>
> Key: HDFS-1001
> URL: https://issues.apache.org/jira/browse/HDFS-1001
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.22.0
>Reporter: bc Wong
>
> Running the TestPread with additional debug statements reveals that the 
> BlockReader sends CHECKSUM_OK when the DataXceiver doesn't expect it. 
> Currently it doesn't matter since DataXceiver closes the connection after 
> each op, and CHECKSUM_OK is the last thing on the wire. But if we want to 
> cache connections, they need to agree on the exchange of CHECKSUM_OK.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1001) DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK

2010-03-01 Thread bc Wong (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12840048#action_12840048
 ] 

bc Wong commented on HDFS-1001:
---

I have a patch. But I can't assign this issue to myself. Could someone please 
fix Jira to let me work on it? Thanks.

> DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK
> -
>
> Key: HDFS-1001
> URL: https://issues.apache.org/jira/browse/HDFS-1001
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.22.0
>Reporter: bc Wong
>
> Running the TestPread with additional debug statements reveals that the 
> BlockReader sends CHECKSUM_OK when the DataXceiver doesn't expect it. 
> Currently it doesn't matter since DataXceiver closes the connection after 
> each op, and CHECKSUM_OK is the last thing on the wire. But if we want to 
> cache connections, they need to agree on the exchange of CHECKSUM_OK.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1001) DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK

2010-02-26 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839021#action_12839021
 ] 

Todd Lipcon commented on HDFS-1001:
---

Testing this is a bit difficult. I think the one _very_ important test is the 
negative test - ie that we don't end up calling 
datanode.blockScanner.verifiedByClient() for a partial read.

I think you could add this as another test case in TestClientBlockVerification 
- just like testCompletePartialRead, except also replace the 
datanode.blockScanner with a spy and verify that verifiedByClient is *not* 
called for the partial read.


> DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK
> -
>
> Key: HDFS-1001
> URL: https://issues.apache.org/jira/browse/HDFS-1001
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.22.0
>Reporter: bc Wong
>
> Running the TestPread with additional debug statements reveals that the 
> BlockReader sends CHECKSUM_OK when the DataXceiver doesn't expect it. 
> Currently it doesn't matter since DataXceiver closes the connection after 
> each op, and CHECKSUM_OK is the last thing on the wire. But if we want to 
> cache connections, they need to agree on the exchange of CHECKSUM_OK.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1001) DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK

2010-02-26 Thread bc Wong (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839015#action_12839015
 ] 

bc Wong commented on HDFS-1001:
---

Yes, Todd. The client sends CHECKSUM_OK when it reads to the end of the data. 
The DN expects CHECKSUM_OK only when it sends the entire block.

It's my goal that either side can independently decide whether a CHECKSUM_OK is 
supposed to be present. (A good protocol should do that.) So I propose:
1. The client keeps its current behaviour, sending CHECKSUM_OK when it reads to 
the end of the data.
2. The DN, after sending all the data, always expects a CHECKSUM_OK. If it's 
not there, it will close the client connection.
3. Point #2 does not matter currently, since the DN always closes the client 
connection. But HDFS-941 may change that.

I can't think of a good way to write a test for this. Any suggestions please?

> DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK
> -
>
> Key: HDFS-1001
> URL: https://issues.apache.org/jira/browse/HDFS-1001
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.22.0
>Reporter: bc Wong
>
> Running the TestPread with additional debug statements reveals that the 
> BlockReader sends CHECKSUM_OK when the DataXceiver doesn't expect it. 
> Currently it doesn't matter since DataXceiver closes the connection after 
> each op, and CHECKSUM_OK is the last thing on the wire. But if we want to 
> cache connections, they need to agree on the exchange of CHECKSUM_OK.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1001) DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK

2010-02-25 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12838242#action_12838242
 ] 

Todd Lipcon commented on HDFS-1001:
---

This is a bug on the datanode side, right? The tests from HDFS-877 at least 
ensure that CHECKSUM_OK works properly where it's supposed to.

> DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK
> -
>
> Key: HDFS-1001
> URL: https://issues.apache.org/jira/browse/HDFS-1001
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.22.0
>Reporter: bc Wong
>
> Running the TestPread with additional debug statements reveals that the 
> BlockReader sends CHECKSUM_OK when the DataXceiver doesn't expect it. 
> Currently it doesn't matter since DataXceiver closes the connection after 
> each op, and CHECKSUM_OK is the last thing on the wire. But if we want to 
> cache connections, they need to agree on the exchange of CHECKSUM_OK.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.