[ 
https://issues.apache.org/jira/browse/HDFS-1850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins updated HDFS-1850:
------------------------------

    Attachment: hdfs-1850-1.patch

Patch attached.

1. Modifies FSDataset to track and report volume failures the same way it 
reports other volume state (capacity, etc.). Adds the test listed in the 
description and makes TestDataNodeVolumeFailureReporting more robust.

2. Renames the volumesFailed metric to volumeFailures to accurately reflect 
what it's tracking. This doesn't break compatibility because this metric (added 
in HDFS-811) has not yet been released.
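
As a rough illustration of item 1 (not the code in the attached patch), a 
dataset could keep a running total of failed volumes and expose it next to 
capacity. Every name below (VolumeSetSketch, markFailed, checkVolumes, 
getNumFailedVolumes) is hypothetical and chosen only for this sketch.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in for a DN dataset; not the real FSDataset API. */
public class VolumeSetSketch {

    /** Minimal volume model: a capacity and a health flag. */
    static class Volume {
        final long capacity;
        boolean healthy = true;

        Volume(long capacity) {
            this.capacity = capacity;
        }
    }

    private final List<Volume> volumes = new ArrayList<>();
    private int numFailedVolumes = 0; // cumulative, never reset by a report

    void addVolume(long capacity) {
        volumes.add(new Volume(capacity));
    }

    /** Simulates a disk error on the volume at the given index. */
    void markFailed(int index) {
        volumes.get(index).healthy = false;
    }

    /** Drops unhealthy volumes and adds them to the running failure total. */
    void checkVolumes() {
        Iterator<Volume> it = volumes.iterator();
        while (it.hasNext()) {
            if (!it.next().healthy) {
                it.remove();
                numFailedVolumes++;
            }
        }
    }

    /** Total capacity of the volumes that are still in service. */
    long getCapacity() {
        long total = 0;
        for (Volume v : volumes) {
            total += v.capacity;
        }
        return total;
    }

    /** Absolute failure count, suitable for reporting alongside capacity. */
    int getNumFailedVolumes() {
        return numFailedVolumes;
    }

    public static void main(String[] args) {
        VolumeSetSketch ds = new VolumeSetSketch();
        ds.addVolume(100L);
        ds.addVolume(200L);
        ds.markFailed(0);
        ds.checkVolumes();
        System.out.println("capacity=" + ds.getCapacity()
            + " failedVolumes=" + ds.getNumFailedVolumes());
    }
}
```

Because the count is state held by the dataset rather than an event, it can 
be read repeatedly without changing.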

> DN should transmit absolute failed volume count rather than increments to the 
> NN
> --------------------------------------------------------------------------------
>
>                 Key: HDFS-1850
>                 URL: https://issues.apache.org/jira/browse/HDFS-1850
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node, name-node
>            Reporter: Eli Collins
>            Assignee: Eli Collins
>             Fix For: 0.23.0
>
>         Attachments: hdfs-1850-1.patch
>
>
> The API added in HDFS-811 for the DN to report volume failures to the NN is 
> "inc(DN)". However, the following sequence of events will result in the NN 
> forgetting about reported failed volumes:
> # DN loses a volume and reports it
> # NN restarts
> # DN re-registers to the new NN
> A more robust interface would be to have the DN report the total number of 
> volume failures to the NN on each heartbeat (the same way other volume state 
> is transmitted).
> This will likely be an incompatible change since it requires changing the 
> Datanode protocol.
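
The sequence in the description can be simulated with a small sketch. This is 
an illustration under assumptions, not the actual NameNode or DatanodeProtocol 
code: NameNodeState, incVolumeFailure and heartbeat are hypothetical names, 
and an NN restart is modelled as simply creating a fresh NameNodeState.

```java
import java.util.HashMap;
import java.util.Map;

public class VolumeFailureReportSketch {

    /** Hypothetical stand-in for the NN's per-DN bookkeeping. */
    static class NameNodeState {
        private final Map<String, Integer> counts = new HashMap<>();

        /** Increment-style report, analogous to the "inc(DN)" call. */
        void incVolumeFailure(String dnId) {
            counts.merge(dnId, 1, Integer::sum);
        }

        /** Absolute report: each heartbeat carries the DN's total. */
        void heartbeat(String dnId, int totalFailedVolumes) {
            counts.put(dnId, totalFailedVolumes);
        }

        int failedVolumes(String dnId) {
            return counts.getOrDefault(dnId, 0);
        }
    }

    public static void main(String[] args) {
        String dn = "dn-1"; // hypothetical datanode id

        // Increment-style reporting.
        NameNodeState nn = new NameNodeState();
        nn.incVolumeFailure(dn);  // DN loses a volume and reports it
        nn = new NameNodeState(); // NN restarts, in-memory state is gone
        // DN re-registers; no new failure occurs, so nothing is incremented.
        System.out.println("increment-style: " + nn.failedVolumes(dn)); // prints 0

        // Absolute reporting: the DN keeps the total and resends it.
        int dnLocalTotal = 1;     // DN lost one volume
        nn = new NameNodeState();
        nn.heartbeat(dn, dnLocalTotal);
        nn = new NameNodeState(); // NN restarts
        nn.heartbeat(dn, dnLocalTotal); // first heartbeat after re-registering
        System.out.println("absolute: " + nn.failedVolumes(dn)); // prints 1
    }
}
```

With absolute counts the report is idempotent, so a restarted NN recovers the 
correct value from the next heartbeat.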

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
