[ 
https://issues.apache.org/jira/browse/HDFS-5583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13897548#comment-13897548
 ] 

Kihwal Lee edited comment on HDFS-5583 at 2/11/14 5:37 AM:
-----------------------------------------------------------

The patch makes the DN send OOB acks to clients that are writing.  The added 
test case currently doesn't do much; it will be updated after the client-side 
changes.  

The OOB Ack sending can still be verified by running the new test case. The 
test log should show something like the following:

{panel}
[DataNode]
2014-02-10 23:23:52,412 INFO  datanode.DataNode 
(DataXceiverServer.java:run(190)) - Shutting down DataXceiverServer before 
restart
2014-02-10 23:23:52,412 INFO  datanode.DataNode 
(BlockReceiver.java:receiveBlock(731)) - Shutting down for restart 
(BP-203907574-10.0.1.17-1392096230619:blk_1073741825_1002).
2014-02-10 23:23:52,413 INFO  datanode.DataNode 
(BlockReceiver.java:sendOOBResponse(977)) - Sending an out of band ack of type 
OOB_TYPE1

[Upstream Datanode]
2014-02-10 23:23:52,413 INFO  datanode.DataNode (BlockReceiver.java:run(1060)) 
- Relaying an out of band ack of type OOB_TYPE

[Client]
2014-02-10 23:23:52,414 WARN  hdfs.DFSClient (DFSOutputStream.java:run(784)) - 
DFSOutputStream ResponseProcessor exception  for block 
BP-203907574-10.0.1.17-1392096230619:blk_1073741825_1002
java.io.IOException: Bad response OOB_TYPE1 for block 
BP-203907574-10.0.1.17-1392096230619:blk_1073741825_1002 from datanode 
127.0.0.1:55182
        at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:732)
{panel}
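
For anyone who wants to automate this check, here is a minimal sketch that scans captured log lines for the two datanode messages shown in the excerpt above. The class and method names (OobLogCheck, sawOobAck) are hypothetical and not part of the patch; the only things taken from the log are the two message prefixes, and the sketch assumes each log record arrives as one unwrapped line.

```java
import java.util.List;

/** Hypothetical helper, not part of the HDFS-5583 patch. */
class OobLogCheck {
    // Message prefixes copied from the log excerpt above.
    static final String SENT = "Sending an out of band ack of type";
    static final String RELAYED = "Relaying an out of band ack of type";

    /**
     * Returns true only if the log contains both the "sending" message
     * (restarting datanode) and the "relaying" message (upstream datanode).
     */
    static boolean sawOobAck(List<String> logLines) {
        boolean sent = false;
        boolean relayed = false;
        for (String line : logLines) {
            if (line.contains(SENT)) {
                sent = true;
            }
            if (line.contains(RELAYED)) {
                relayed = true;
            }
        }
        return sent && relayed;
    }
}
```

This only confirms that the messages were logged; it does not check the ack type or the ordering between nodes.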


> Make DN send an OOB Ack on shutdown before restarting
> -----------------------------------------------------
>
>                 Key: HDFS-5583
>                 URL: https://issues.apache.org/jira/browse/HDFS-5583
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Kihwal Lee
>            Assignee: Kihwal Lee
>         Attachments: HDFS-5583.patch
>
>
> Add the ability for data nodes to send an OOB response to indicate an 
> upcoming upgrade-restart. The client should ignore the pipeline error from 
> the node for a configured amount of time and try to reconstruct the pipeline 
> without excluding the restarted node.  If the node does not come back in 
> time, regular pipeline recovery should happen.
> This feature is useful for applications that need to keep blocks local. 
> If the upgrade-restart is fast, the wait is preferable to losing locality.  
> It could also be used in general instead of the draining-writer strategy.
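
The client-side behavior described in the issue can be sketched roughly as follows. This is only an illustration of the decision rule (wait for the restarting node up to a configured timeout, then fall back to regular pipeline recovery). All names here (RestartAwareRecovery, Action, onRestartOob, onPipelineError) are hypothetical and do not come from the patch or from DFSOutputStream, and time is passed in as plain milliseconds to keep the sketch self-contained.

```java
/** Hypothetical sketch of the client-side rule; not the actual HDFS client code. */
class RestartAwareRecovery {

    /** What the writer does about an error from a pipeline node. */
    enum Action { WAIT_FOR_RESTART, REGULAR_RECOVERY }

    private final long restartTimeoutMillis; // the "configured amount of time"
    private boolean restartAnnounced = false;
    private long deadlineMillis = 0;

    RestartAwareRecovery(long restartTimeoutMillis) {
        this.restartTimeoutMillis = restartTimeoutMillis;
    }

    /** Called when an OOB ack announcing an upgrade-restart is received. */
    void onRestartOob(long nowMillis) {
        restartAnnounced = true;
        deadlineMillis = nowMillis + restartTimeoutMillis;
    }

    /** Called when the pipeline reports an error from the same node. */
    Action onPipelineError(long nowMillis) {
        if (restartAnnounced && nowMillis < deadlineMillis) {
            // Still inside the window: keep the node in the pipeline
            // and retry instead of excluding it.
            return Action.WAIT_FOR_RESTART;
        }
        // No restart was announced, or the node did not come back in time.
        restartAnnounced = false;
        return Action.REGULAR_RECOVERY;
    }
}
```

With a 30-second timeout, an error seen 4 seconds after the OOB ack yields WAIT_FOR_RESTART, while an error seen after the deadline (or with no OOB ack at all) yields REGULAR_RECOVERY.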



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
