[jira] [Updated] (HDFS-7374) Allow decommissioning of dead DataNodes

2014-11-18 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-7374:
--
Resolution: Fixed
Fix Version/s: 2.7.0
Status: Resolved  (was: Patch Available)

Thanks for the final sign-off, ATM. I've committed this to trunk and branch-2.
Thanks to Zhe for the patch and to Ming for the helpful advice throughout.

> Allow decommissioning of dead DataNodes
> ---
>
> Key: HDFS-7374
> URL: https://issues.apache.org/jira/browse/HDFS-7374
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Zhe Zhang
> Assignee: Zhe Zhang
> Fix For: 2.7.0
>
> Attachments: HDFS-7374-001.patch, HDFS-7374-002.patch, HDFS-7374.003.patch
>
>
> We have seen the use case of decommissioning DataNodes that are already dead 
> or unresponsive, and not expected to rejoin the cluster.
> The logic introduced by HDFS-6791 will mark those nodes as 
> {{DECOMMISSION_INPROGRESS}}, with a hope that they can come back and finish 
> the decommission work. If an upper layer application is monitoring the 
> decommissioning progress, it will hang forever.
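The hang described above can be illustrated with a minimal sketch of a monitoring loop. This is not taken from Hadoop or from any real monitoring tool: `wait_for_decommission`, `get_state`, and `max_polls` are hypothetical names, and the admin state is modeled as a plain string. A real monitor would also sleep between polls.

```python
from typing import Callable


def wait_for_decommission(get_state: Callable[[], str], max_polls: int) -> bool:
    """Poll until the node reports DECOMMISSIONED; give up after max_polls checks."""
    for _ in range(max_polls):
        if get_state() == "DECOMMISSIONED":
            return True
    return False


def stuck_dead_node() -> str:
    # Assumption: a dead node left in DECOMMISSION_INPROGRESS never changes state.
    return "DECOMMISSION_INPROGRESS"


def finished_node() -> str:
    return "DECOMMISSIONED"


print(wait_for_decommission(stuck_dead_node, max_polls=5))  # False
print(wait_for_decommission(finished_node, max_polls=5))    # True
```

Without the `max_polls` bound, the first call would never return, which is the behavior the issue reports for upper-layer applications.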



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7374) Allow decommissioning of dead DataNodes

2014-11-17 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-7374:
--
Attachment: HDFS-7374.003.patch

I went ahead and made the small change required in 
DatanodeManager#registerDatanode to make this work. The registration steps ran 
in a different order for the "known DN" and "unknown DN" cases, so I reordered 
them so that an unknown DN is marked as alive before we check its decommission 
state.
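To see why the ordering could matter, here is a toy model, not the actual DatanodeManager code. `Node`, `check_decommission`, `register`, and the `alive_first` flag are all hypothetical, and the rule that a dead excluded node is marked decommissioned at once is an assumption based on this issue's goal. Under that assumption, checking the decommission state before marking a newly registering node alive would treat a live node as dead.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    alive: bool = False
    state: str = "NORMAL"


def check_decommission(node: Node, excluded: set) -> None:
    """Hypothetical check: excluded dead nodes finish at once, live ones start work."""
    if node.name not in excluded:
        return
    node.state = "DECOMMISSION_INPROGRESS" if node.alive else "DECOMMISSIONED"


def register(node: Node, excluded: set, alive_first: bool = True) -> None:
    """Register a node; alive_first selects the order of the two steps."""
    if alive_first:
        node.alive = True
        check_decommission(node, excluded)
    else:
        check_decommission(node, excluded)
        node.alive = True


good = Node("dn1")
register(good, {"dn1"})                     # mark alive, then check
bad = Node("dn2")
register(bad, {"dn2"}, alive_first=False)   # check while still marked dead
print(good.state)  # DECOMMISSION_INPROGRESS
print(bad.state)   # DECOMMISSIONED
```

In this model the second ordering skips the in-progress phase for a node that is actually up, which is the kind of mismatch the reordering avoids.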



[jira] [Updated] (HDFS-7374) Allow decommissioning of dead DataNodes

2014-11-13 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-7374:

Attachment: HDFS-7374-002.patch

Thanks [~andrew.wang] for the review! The previous patch missed the {DEAD, 
DECOMM_IN_PROGRESS} -> {DEAD, DECOMMED} case, and I agree with moving the logic 
to {{startDecommission}}. The revised patch also addresses the comments on the 
unit test code.
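The transition discussed above can be sketched as a small state model. This is an illustration under stated assumptions, not the patch itself: `AdminState`, `Node`, and `start_decommission` are hypothetical Python stand-ins, and the exact conditions in the real {{startDecommission}} may differ.

```python
from dataclasses import dataclass
from enum import Enum


class AdminState(Enum):
    NORMAL = "NORMAL"
    DECOMMISSION_INPROGRESS = "DECOMMISSION_INPROGRESS"
    DECOMMISSIONED = "DECOMMISSIONED"


@dataclass
class Node:
    alive: bool
    admin_state: AdminState = AdminState.NORMAL


def start_decommission(node: Node) -> None:
    """Toy model: a dead node goes straight to DECOMMISSIONED."""
    if node.admin_state is AdminState.DECOMMISSIONED:
        return  # nothing left to do
    if not node.alive:
        # Covers both {DEAD, NORMAL} and {DEAD, DECOMM_IN_PROGRESS}.
        node.admin_state = AdminState.DECOMMISSIONED
    elif node.admin_state is AdminState.NORMAL:
        node.admin_state = AdminState.DECOMMISSION_INPROGRESS


dead = Node(alive=False, admin_state=AdminState.DECOMMISSION_INPROGRESS)
start_decommission(dead)
print(dead.admin_state.value)  # DECOMMISSIONED
```

A live node in this model still passes through `DECOMMISSION_INPROGRESS`; only dead nodes take the shortcut.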



[jira] [Updated] (HDFS-7374) Allow decommissioning of dead DataNodes

2014-11-07 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-7374:

Attachment: HDFS-7374-001.patch

Thanks [~mingma] again for the comment.

This patch implements option #2. It also moves a utility function to 
{{DFSTestUtil}} so that it is accessible in the new unit test.



[jira] [Updated] (HDFS-7374) Allow decommissioning of dead DataNodes

2014-11-07 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-7374:

Status: Patch Available  (was: Open)
