[ https://issues.apache.org/jira/browse/HDFS-11164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rakesh R updated HDFS-11164:
----------------------------
    Description: 
When the Mover tries to move a pinned block to another datanode, it
internally hits the following IOException and marks the block movement as
{{failure}}. Since the Mover has the {{dfs.mover.retry.max.attempts}} config,
it keeps retrying the move of this block until it reaches
{{retryMaxAttempts}}. If the block movement failure(s) are only due to block
pinning, then the retry is unnecessary. The idea of this jira is to avoid
retry attempts for pinned blocks, as they cannot be moved to a different node.

{code}
2016-11-22 10:56:10,537 WARN org.apache.hadoop.hdfs.server.balancer.Dispatcher: 
Failed to move blk_1073741825_1001 with size=52 from 127.0.0.1:19501:DISK to 
127.0.0.1:19758:ARCHIVE through 127.0.0.1:19501
java.io.IOException: Got error, status=ERROR, status message opReplaceBlock 
BP-1772076264-10.252.146.200-1479792322960:blk_1073741825_1001 received 
exception java.io.IOException: Got error, status=ERROR, status message Not able 
to copy block 1073741825 to /127.0.0.1:19826 because it's pinned , copy block 
BP-1772076264-10.252.146.200-1479792322960:blk_1073741825_1001 from 
/127.0.0.1:19501, reportedBlock move is failed
        at org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:118)
        at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.receiveResponse(Dispatcher.java:417)
        at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.dispatch(Dispatcher.java:358)
        at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.access$5(Dispatcher.java:322)
        at org.apache.hadoop.hdfs.server.balancer.Dispatcher$1.run(Dispatcher.java:1075)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
{code}
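
A minimal sketch of the idea, not the attached patch: a retry loop that gives
up immediately when the failure is caused by block pinning and only retries
other IOExceptions. {{PinnedBlockException}}, {{BlockMove}}, {{Outcome}} and
{{moveWithRetry}} are hypothetical names used for illustration; they are not
Hadoop classes, and the real Mover tracks its retries differently.

```java
import java.io.IOException;

public class PinnedBlockRetrySketch {

    /** Hypothetical marker for "the block is pinned"; not a Hadoop class. */
    static class PinnedBlockException extends IOException {
        PinnedBlockException(String message) {
            super(message);
        }
    }

    /** One attempt to move a block; stands in for the real dispatch call. */
    interface BlockMove {
        void run() throws IOException;
    }

    enum Result { MOVED, SKIPPED_PINNED, RETRIES_EXHAUSTED }

    /** Final result plus the number of attempts that were actually made. */
    static final class Outcome {
        final Result result;
        final int attempts;

        Outcome(Result result, int attempts) {
            this.result = result;
            this.attempts = attempts;
        }
    }

    /** Retries a move up to retryMaxAttempts times, except for pinned blocks. */
    static Outcome moveWithRetry(BlockMove move, int retryMaxAttempts) {
        int attempts = 0;
        while (attempts < retryMaxAttempts) {
            attempts++;
            try {
                move.run();
                return new Outcome(Result.MOVED, attempts);
            } catch (PinnedBlockException e) {
                // Retrying cannot help: a pinned block stays on its datanode.
                return new Outcome(Result.SKIPPED_PINNED, attempts);
            } catch (IOException e) {
                // Possibly transient failure; loop around and try again.
            }
        }
        return new Outcome(Result.RETRIES_EXHAUSTED, attempts);
    }

    public static void main(String[] args) {
        Outcome pinned = moveWithRetry(
            () -> { throw new PinnedBlockException("block is pinned"); }, 10);
        // Prints "SKIPPED_PINNED after 1 attempt(s)"
        System.out.println(pinned.result + " after " + pinned.attempts + " attempt(s)");

        Outcome failing = moveWithRetry(
            () -> { throw new IOException("connection reset"); }, 10);
        // Prints "RETRIES_EXHAUSTED after 10 attempt(s)"
        System.out.println(failing.result + " after " + failing.attempts + " attempt(s)");
    }
}
```

With a retry limit of 10, the pinned case stops after a single attempt, while
a failure of any other kind still uses all 10 attempts.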

  was:
When the Mover tries to move a pinned block to another datanode, it
internally hits the following IOException and marks the block movement as
{{failure}}. Since the Mover has the {{dfs.mover.retry.max.attempts}} config,
it keeps retrying the move of this block until it reaches
{{retryMaxAttempts}}. This retry is unnecessary, and it would be good to avoid
retry attempts, as a pinned block cannot be moved.

{code}
2016-11-22 10:56:10,537 WARN org.apache.hadoop.hdfs.server.balancer.Dispatcher: 
Failed to move blk_1073741825_1001 with size=52 from 127.0.0.1:19501:DISK to 
127.0.0.1:19758:ARCHIVE through 127.0.0.1:19501
java.io.IOException: Got error, status=ERROR, status message opReplaceBlock 
BP-1772076264-10.252.146.200-1479792322960:blk_1073741825_1001 received 
exception java.io.IOException: Got error, status=ERROR, status message Not able 
to copy block 1073741825 to /127.0.0.1:19826 because it's pinned , copy block 
BP-1772076264-10.252.146.200-1479792322960:blk_1073741825_1001 from 
/127.0.0.1:19501, reportedBlock move is failed
        at org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:118)
        at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.receiveResponse(Dispatcher.java:417)
        at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.dispatch(Dispatcher.java:358)
        at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.access$5(Dispatcher.java:322)
        at org.apache.hadoop.hdfs.server.balancer.Dispatcher$1.run(Dispatcher.java:1075)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
{code}


> Mover should avoid unnecessary retries if the block is pinned
> -------------------------------------------------------------
>
>                 Key: HDFS-11164
>                 URL: https://issues.apache.org/jira/browse/HDFS-11164
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: balancer & mover
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>         Attachments: HDFS-11164-00.patch, HDFS-11164-01.patch
>
>
> When the Mover tries to move a pinned block to another datanode, it
> internally hits the following IOException and marks the block movement as
> {{failure}}. Since the Mover has the {{dfs.mover.retry.max.attempts}}
> config, it keeps retrying the move of this block until it reaches
> {{retryMaxAttempts}}. If the block movement failure(s) are only due to block
> pinning, then the retry is unnecessary. The idea of this jira is to avoid
> retry attempts for pinned blocks, as they cannot be moved to a different
> node.
> {code}
> 2016-11-22 10:56:10,537 WARN 
> org.apache.hadoop.hdfs.server.balancer.Dispatcher: Failed to move 
> blk_1073741825_1001 with size=52 from 127.0.0.1:19501:DISK to 
> 127.0.0.1:19758:ARCHIVE through 127.0.0.1:19501
> java.io.IOException: Got error, status=ERROR, status message opReplaceBlock 
> BP-1772076264-10.252.146.200-1479792322960:blk_1073741825_1001 received 
> exception java.io.IOException: Got error, status=ERROR, status message Not 
> able to copy block 1073741825 to /127.0.0.1:19826 because it's pinned , copy 
> block BP-1772076264-10.252.146.200-1479792322960:blk_1073741825_1001 from 
> /127.0.0.1:19501, reportedBlock move is failed
>       at org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:118)
>       at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.receiveResponse(Dispatcher.java:417)
>       at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.dispatch(Dispatcher.java:358)
>       at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.access$5(Dispatcher.java:322)
>       at org.apache.hadoop.hdfs.server.balancer.Dispatcher$1.run(Dispatcher.java:1075)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at java.lang.Thread.run(Thread.java:745)
> {code}


