[ https://issues.apache.org/jira/browse/HDFS-1094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856139#action_12856139 ]

Brian Bockelman commented on HDFS-1094:
---------------------------------------

Hey Karthik,

Let me play dumb (it might not be playing after all) and try to work out the 
math a bit.

First, let's assume that on any given day, a node has 1/1000 chance of failing.

CURRENT SCHEME: A block is on 3 random nodes.  The block is lost only if nodes 
X, Y, and Z all fail at once.  Assuming the failures are independent, 
P(X and Y and Z) = P(X) P(Y) P(Z) = 1 in a billion.

PROPOSED SCHEME:  Well, the probability is the same.

So, given a specific block, we don't change the probability it is lost.
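
A quick back-of-the-envelope check (just a sketch in Python, assuming p = 1/1000 
per node per day and independent failures, as above):

    # Loss probability for one specific block whose replicas sit on nodes X, Y, Z.
    # Assumes independent failures with p = 1/1000 per node per day.
    p = 1.0 / 1000
    p_block_loss = p ** 3        # P(X and Y and Z) = p^3
    print(p_block_loss)          # 1e-09, i.e. 1 in a billion -- same under either scheme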

What you seem to be calculating is the probability that three nodes go down out 
of N nodes:

P(nodes X, Y, and Z fail for any three distinct X, Y, Z) = 1 - P(N-3 nodes stay 
up) = 1 - (999/1000)^(N-3)

Sure enough, for a small subset (say N=40), the probability of three nodes 
failing is smaller than for the whole cluster.
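
For concreteness (a sketch of the quantity above, with p = 1/1000; the N values 
are just illustrative):

    # 1 - (999/1000)^(N-3), evaluated for a small subset vs. larger clusters.
    p_up = 999.0 / 1000
    for N in (40, 1000, 5000):
        print(N, 1 - p_up ** (N - 3))
    # N=40   -> ~0.036
    # N=1000 -> ~0.63
    # N=5000 -> ~0.99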

However, that's not the number you want!  You want the probability that *any* 
block is lost when three nodes go down.  That is, P(nodes X, Y, and Z fail for 
any three distinct X, Y, Z, and X, Y, Z hold at least one block in common) (call 
this P_1).  Assuming that block overlap, node death, and the choice of node 
subset are all independent, you get:

P_1 = P(three nodes having at least one common block) * P(3 node death) * (# of 
distinct 3-node subsets)

The first factor decreases with N, the second is constant in N, and the third 
increases with N.  The third is a well-known formula (N choose 3), while I don't 
have a good formula for the first value.  Unless you can calculate or estimate 
the first, I don't think you can really say anything about decreasing the value 
of P_1.
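
To make the structure concrete, here is a sketch of how the three factors would 
combine (the third factor is C(N,3); the first is left as an unknown function of 
N that would still need to be derived or measured):

    from math import comb

    p = 1.0 / 1000
    p_three_deaths = p ** 3            # constant in N

    def p_common_block(N):
        # Unknown: probability that a given 3-node subset holds at least one
        # block in common.  Decreases with N; needs to be estimated before
        # anything can be concluded about P_1.
        raise NotImplementedError

    def P_1(N):
        # P_1 = P(common block) * P(3 node death) * (# of distinct 3-node subsets)
        return p_common_block(N) * p_three_deaths * comb(N, 3)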

I *think* we are incorrectly treating the probability of data loss as 
proportional to the probability of 3 machines in a subset being lost, without 
taking into account the probability of common blocks.  The probabilities get 
tricky, hence my asking for someone to sketch it out mathematically... 



> Intelligent block placement policy to decrease probability of block loss
> ------------------------------------------------------------------------
>
>                 Key: HDFS-1094
>                 URL: https://issues.apache.org/jira/browse/HDFS-1094
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>
> The current HDFS implementation specifies that the first replica is local and 
> the other two replicas are on any two random nodes on a random remote rack. 
> This means that if any three datanodes die together, then there is a 
> non-trivial probability of losing at least one block in the cluster. This 
> JIRA is to discuss if there is a better algorithm that can lower probability 
> of losing a block.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
https://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
