[ https://issues.apache.org/jira/browse/HDFS-1094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886768#action_12886768 ]
Tsz Wo (Nicholas), SZE commented on HDFS-1094:
----------------------------------------------

> Here the assumption is, correct me if it's wrong, that f nodes fail
> simultaneously. Otherwise, we should take into account replication
> process, ...

I totally agree that we should take the replication process into account.
That's also the reason why we cannot ignore the time T in the equations. A
model that considers only the data loss (i.e. disk failure) probability P
without time T is, in my opinion, not useful, because the probability of
simultaneous multiple disk failures in the real world is really low,
probably negligible. I think it would happen only in a site-wide disaster
situation.

Rodrigo, have you observed any case of simultaneous multiple disk failures?
(A back-of-the-envelope sketch of why T matters is appended below.)

> Intelligent block placement policy to decrease probability of block loss
> -------------------------------------------------------------------------
>
>                 Key: HDFS-1094
>                 URL: https://issues.apache.org/jira/browse/HDFS-1094
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>            Reporter: dhruba borthakur
>            Assignee: Rodrigo Schmidt
>         Attachments: prob.pdf, prob.pdf
>
>
> The current HDFS implementation specifies that the first replica is local
> and the other two replicas are on any two random nodes on a random remote
> rack. This means that if any three datanodes die together, then there is a
> non-trivial probability of losing at least one block in the cluster. This
> JIRA is to discuss if there is a better algorithm that can lower the
> probability of losing a block.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
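To make the point about T concrete, here is a minimal back-of-the-envelope
sketch. The failure rate, the independence assumption, and the re-replication
window used here are purely illustrative assumptions and are not taken from
the attached prob.pdf:

    public class BlockLossSketch {
        public static void main(String[] args) {
            // Assumed numbers, purely illustrative -- not taken from prob.pdf.
            double lambdaPerHour = 0.02 / (365 * 24);  // ~2% annual disk failure rate
            double windowHours   = 1.0;                // assumed re-replication window T

            // With an exponential failure model, P(a disk fails within T) = 1 - e^(-lambda*T).
            double pFailInT = 1.0 - Math.exp(-lambdaPerHour * windowHours);

            // After one replica's disk dies, that block is lost only if the disks holding
            // the other two replicas also fail before re-replication finishes
            // (independent failures assumed).
            double pLoseBlock = pFailInT * pFailInT;

            System.out.printf("P(one disk fails within T=1h)   = %.3e%n", pFailInT);
            System.out.printf("P(both remaining replicas lost) = %.3e%n", pLoseBlock);

            // The same calculation with T = 24h: the loss probability grows roughly
            // quadratically with T, which is why T cannot be dropped from the model.
            double pFailIn24h = 1.0 - Math.exp(-lambdaPerHour * 24.0);
            System.out.printf("Same, with T=24h                = %.3e%n",
                              pFailIn24h * pFailIn24h);
        }
    }

Under these assumptions the loss probability for a single block is tiny, but it
scales with the number of blocks on a failed disk and with the length of the
re-replication window, so the window T, rather than the chance that several
disks die at the same instant, is what dominates the model.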