[ https://issues.apache.org/jira/browse/HDFS-268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12726750#action_12726750 ]

Robert Chansler commented on HDFS-268:
--------------------------------------

Does anybody have robust evidence that making the replication factor small (or 
large; that happens, too!) really helps, in a case that is neither contrived 
nor too small to justify optimizing the system for it?

I'd be content not to allow small RF. But if the will of the user is to be 
respected, life is too short for system administrators to be interrupted 
whenever a block is lost from a file with a small RF. Diagnostic and monitoring 
tools should quickly dismiss alerts for such lost blocks.
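
A minimal sketch of the kind of filter I mean, assuming the alert already 
carries the path of the affected file; the threshold and class names here are 
made up for illustration, not an existing tool:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch only: decide whether a "missing block" alert is worth paging anyone
// about, based on the replication factor the file was written with.
public class LowRfAlertFilter {
    // Assumed site default; files written below this were a deliberate trade-off.
    private static final short RF_THRESHOLD = 3;

    public static boolean shouldPage(FileSystem fs, Path fileInAlert) throws IOException {
        FileStatus status = fs.getFileStatus(fileInAlert);
        return status.getReplication() >= RF_THRESHOLD;
    }

    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        Path p = new Path(args[0]); // path reported by the monitoring tool
        System.out.println(shouldPage(fs, p) ? "page the operator" : "dismiss quietly");
    }
}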

Maybe the system should just heal itself. Delete the broken file and get on 
with life. Or truncate it to the last good block. Or append "You lose!" to the 
file name.
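
And a sketch of the self-healing option, again only an illustration: how the 
broken paths are discovered (fsck output, a monitoring feed) is left open, and 
deleting is just the simplest of the choices above.

import java.io.IOException;
import java.util.List;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch only: drop files that have lost blocks and were written with a small
// replication factor, instead of alerting an operator.
public class LowRfSelfHealer {
    private static final short RF_THRESHOLD = 3; // assumed site default

    public static void healOrIgnore(FileSystem fs, List<Path> filesWithMissingBlocks)
            throws IOException {
        for (Path p : filesWithMissingBlocks) {
            if (fs.getFileStatus(p).getReplication() < RF_THRESHOLD) {
                // The user chose a small RF, so treat the file as disposable.
                fs.delete(p, false); // non-recursive: these are files, not directories
            }
        }
    }
}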

> Distinguishing file missing/corruption for low replication files
> ----------------------------------------------------------------
>
>                 Key: HDFS-268
>                 URL: https://issues.apache.org/jira/browse/HDFS-268
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Koji Noguchi
>
> In PIG-856, there's a discussion about reducing the replication factor for 
> intermediate files between jobs.
> I've seen users do the same in MapReduce jobs and get some speedup. (I 
> believe their outputs were too small to benefit from the write pipelining.)
> The problem is that when users start changing the replication factor to 1 
> (or 2), ops start seeing alerts from fsck and HADOOP-4103 even with a single 
> datanode failure.
> There's also the problem of the Namenode not getting out of safemode when it 
> is restarted.
> My answer has been to ask users, "please don't set the replication factor 
> below 3".
> But is this the right approach?
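
For reference, the knob discussed in the quoted description: a sketch of how a 
job might lower the replication factor on its intermediate output, either 
through dfs.replication on its configuration or per file with 
FileSystem.setReplication(). The path and the value of 2 are illustrative only.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class LowerIntermediateReplication {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // Option 1: have everything this client writes default to a smaller RF.
        conf.setInt("dfs.replication", 2);
        FileSystem fs = FileSystem.get(conf);

        // Option 2: lower the RF of an existing intermediate file after the fact.
        Path intermediate = new Path("/tmp/job-intermediate/part-00000");
        fs.setReplication(intermediate, (short) 2);
    }
}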

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
