[ 
https://issues.apache.org/jira/browse/HDDS-1595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Supratim Deka updated HDDS-1595:
--------------------------------
    Attachment: Handling IO Failures on the Datanode.pdf

> Handling IO Failures on the Datanode
> ------------------------------------
>
>                 Key: HDDS-1595
>                 URL: https://issues.apache.org/jira/browse/HDDS-1595
>             Project: Hadoop Distributed Data Store
>          Issue Type: Improvement
>          Components: Ozone Datanode
>            Reporter: Supratim Deka
>            Priority: Major
>         Attachments: Handling IO Failures on the Datanode.pdf, Raft IO v2.svg
>
>
> This Jira covers all the changes required to handle IO failures on the 
> Datanode. Handling an IO failure on the Datanode involves detecting failures 
> as they happen and propagating each failure to the appropriate component in 
> the system - possibly the Client and/or the SCM, depending on the type of failure.
> At a high level, IO failure handling has the following goals:
> 1. Prevent inconsistencies and corruption caused by unhandled or 
> mishandled failures.
> 2. Prevent data loss - detect failures promptly and propagate the correct 
> error back to the initiator, instead of silently dropping the data while the 
> client assumes the operation is committed.
> 3. Contain the disruption in the system - if a disk volume fails on a DN, 
> operations on the other nodes and volumes should not be affected.
> Details of the design and the required changes are covered in the attached 
> PDF document.
> A sequence diagram used to analyse the Datanode IO path is also attached, in 
> SVG format.
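
The three goals in the description (no silent data loss, correct error propagation, and containment to the failed volume) can be illustrated with a small sketch. This is a toy model under stated assumptions, not Ozone code: it is written in Python for brevity although Ozone itself is Java, and every name in it (`Volume`, `Datanode`, `VolumeFailedError`, `failure_reports`) is hypothetical. The actual design is in the attached PDF.

```python
import os
import tempfile


class VolumeFailedError(OSError):
    """Hypothetical error returned to the initiator when a write is not durable."""


class Volume:
    """Toy model of one disk volume on a datanode (not an Ozone class)."""

    def __init__(self, root):
        self.root = root
        self.failed = False

    def write_chunk(self, name, data):
        if self.failed:
            # Fail fast on a volume already known to be bad.
            raise VolumeFailedError(f"volume {self.root} is marked failed")
        try:
            with open(os.path.join(self.root, name), "wb") as f:
                f.write(data)
                f.flush()
                os.fsync(f.fileno())  # surface IO errors now, not at some later flush
        except OSError as e:
            # Detect: record the failure on this volume only.
            self.failed = True
            # Propagate: never swallow the error, so the caller cannot
            # assume the write was committed.
            raise VolumeFailedError(f"write to {self.root} failed: {e}") from e


class Datanode:
    """Toy datanode holding several independent volumes."""

    def __init__(self, volumes):
        self.volumes = volumes
        self.failure_reports = []  # stand-in for failure reports sent to the SCM

    def write_chunk(self, volume_index, name, data):
        volume = self.volumes[volume_index]
        try:
            volume.write_chunk(name, data)
        except VolumeFailedError:
            if volume.root not in self.failure_reports:
                self.failure_reports.append(volume.root)
            raise  # the client still sees the error
```

In this sketch a write to a broken volume raises `VolumeFailedError` to the caller and records one report for that volume, while writes routed to the other volumes of the same `Datanode` continue to succeed.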



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

