[ 
https://issues.apache.org/jira/browse/HDFS-12649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gruust updated HDFS-12649:
--------------------------
    Description: 
Hadoop's documentation tells me it's suitable for commodity hardware in the 
sense that hardware failures are expected to happen frequently. However, there 
is currently no automatic handling of corrupted blocks, which seems a bit 
contradictory to me.

See: https://stackoverflow.com/questions/19205057/how-to-fix-corrupt-hdfs-files

This is even problematic for data integrity, as the redundancy is not kept at 
the desired level without manual intervention, and therefore not in a timely 
manner. If there is a corrupted block, I would at least expect the namenode to 
force the creation of an additional good replica to keep up the redundancy 
level, i.e. the redundancy level should never include corrupted data... which 
it currently does:

    "UnderReplicatedBlocks" : 0,
    "CorruptBlocks" : 2,

(namenode /jmx http dump)
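
To make the condition above easy to check, here is a minimal Python sketch that reads the two counters out of a /jmx dump and flags the case where corrupt blocks exist while nothing is reported as under-replicated. The "beans" wrapper, the bean name, and the helper names block_health and redundancy_masked are assumptions for illustration; only the two counter names and their values come from the dump quoted above.

```python
import json

# Assumed shape of a namenode /jmx response: a top-level "beans" list whose
# entries are flat dicts of metrics. The two counter values mirror the dump
# quoted in this report; the bean name is an assumption.
SAMPLE_JMX = """
{"beans": [{"name": "Hadoop:service=NameNode,name=FSNamesystem",
            "UnderReplicatedBlocks": 0,
            "CorruptBlocks": 2}]}
"""


def block_health(jmx_text):
    """Return (under_replicated, corrupt) from the first bean that has both."""
    for bean in json.loads(jmx_text).get("beans", []):
        if "UnderReplicatedBlocks" in bean and "CorruptBlocks" in bean:
            return bean["UnderReplicatedBlocks"], bean["CorruptBlocks"]
    raise KeyError("no bean exposes both block counters")


def redundancy_masked(under_replicated, corrupt):
    """True when corrupt blocks exist but nothing is reported under-replicated."""
    return corrupt > 0 and under_replicated == 0


under, corrupt = block_health(SAMPLE_JMX)
print(under, corrupt, redundancy_masked(under, corrupt))  # prints: 0 2 True
```

In practice the text would come from the namenode's HTTP /jmx endpoint rather than from a hard-coded sample.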

  was:
Hadoop's documentation tells me it's suitable for commodity hardware in the 
sense that hardware failures are expected to happen frequently. However, there 
is currently no automatic handling of corrupted blocks, which seems a bit 
contradictory to me.

See: https://stackoverflow.com/questions/19205057/how-to-fix-corrupt-hdfs-files

This is even problematic for data integrity, as the redundancy is not kept at 
the desired level without manual intervention, and therefore not in a timely 
manner. If there is a corrupted block, I would at least expect the namenode to 
force the creation of an additional good replica to keep up the redundancy 
level. 


> handling of corrupt blocks not suitable for commodity hardware
> --------------------------------------------------------------
>
>                 Key: HDFS-12649
>                 URL: https://issues.apache.org/jira/browse/HDFS-12649
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>    Affects Versions: 2.8.1
>            Reporter: Gruust
>            Priority: Minor
>
> Hadoop's documentation tells me it's suitable for commodity hardware in the 
> sense that hardware failures are expected to happen frequently. However, 
> there is currently no automatic handling of corrupted blocks, which seems a 
> bit contradictory to me.
> See: 
> https://stackoverflow.com/questions/19205057/how-to-fix-corrupt-hdfs-files
> This is even problematic for data integrity, as the redundancy is not kept 
> at the desired level without manual intervention, and therefore not in a 
> timely manner. If there is a corrupted block, I would at least expect the 
> namenode to force the creation of an additional good replica to keep up the 
> redundancy level, i.e. the redundancy level should never include corrupted 
> data... which it currently does:
>     "UnderReplicatedBlocks" : 0,
>     "CorruptBlocks" : 2,
> (namenode /jmx http dump)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
