[ 
https://issues.apache.org/jira/browse/ACCUMULO-4004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15217898#comment-15217898
 ] 

Dave Marion commented on ACCUMULO-4004:
---------------------------------------

Basically, DN decommissioning is broken right now in Hadoop 2.

WALogs stay open until they hit the size threshold, which can take many hours 
or even days in some cases. These open files prevent a DN from finishing its 
decommissioning process [1]. If you stop the DN anyway, the WALog file will not 
be closed and you could lose data. Instead, you have to find the tservers that 
are writing to the WALog and stop them so that the WALog is closed.

There is also another nasty bug [2] where the NN gives clients old locations of 
blocks that have been moved due to decommissioning. As you can imagine, this can 
create all kinds of problems. Then there is [3], with all of its related issues.

With this patch, you can set the max age to the amount of time you are willing 
to wait for a DN to decommission (if you choose to take the risk of hitting 
[2]).
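
To make the max-age idea concrete, here is a rough sketch of the roll decision: 
close the WALog when it reaches the size threshold or when it has been open 
longer than the configured max age, whichever comes first. This is not the code 
from the patch. The class, method, and parameter names below are hypothetical 
and only illustrate the logic.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical illustration only; not Accumulo's actual implementation.
final class WalRollPolicy {
    private final long maxSizeBytes; // assumed size threshold for a WALog
    private final Duration maxAge;   // assumed max time a WALog may stay open

    WalRollPolicy(long maxSizeBytes, Duration maxAge) {
        this.maxSizeBytes = maxSizeBytes;
        this.maxAge = maxAge;
    }

    // Returns true when the log should be closed and a new one started.
    boolean shouldRoll(long currentSizeBytes, Instant openedAt, Instant now) {
        boolean tooBig = currentSizeBytes >= maxSizeBytes;
        boolean tooOld = Duration.between(openedAt, now).compareTo(maxAge) >= 0;
        return tooBig || tooOld;
    }
}
```

With a max age of one hour, a nearly empty WALog would still be rolled an hour 
after it was opened, so a decommissioning DN would wait at most about that long 
for the file to close.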

[1] https://issues.apache.org/jira/browse/HDFS-3599
[2] https://issues.apache.org/jira/browse/HDFS-8208
[3] https://issues.apache.org/jira/browse/HDFS-8406

> open WALs prevent DN decommissioning
> ------------------------------------
>
>                 Key: ACCUMULO-4004
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4004
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: tserver
>            Reporter: Eric Newton
>            Assignee: Eric Newton
>             Fix For: 1.6.6, 1.7.2, 1.8.0
>
>         Attachments: ACCUMULO-4004-1.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> It should be possible to manually roll WALs so that files on decommissioning 
> datanodes are closed and the decommissioning process can complete. At the 
> very least, the logs could be closed after an elapsed period of time, such as 
> an hour.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
