Re: Design issue for a problem using Map Reduce

2009-02-23 Thread some speed
Thanks Sagar...That helps to a certain extent.
But is dependency not a common occurrence among equations? Doesn't Hadoop
provide a way to solve such equations in parallel?
Going in for a sequential calculation might prove to be a major performance
degradation given tens of thousands of numbers. Does any one have any ideas
?

Thanks.

On Sun, Feb 15, 2009 at 1:34 AM, Sagar Naik sn...@attributor.com wrote:

 Here is one thought
 N maps and 1 Reduce,
 input to map: t,w(t)
 output of map t, w(t)*w(t)
 I assume t is an integer. So in case of 1 reducer, u will receive
 t0, square(w(0)
 t1, square(w(1)
 t2, square(w(2)
 t3, square(w(3)
 Note this wiil be a sorted series on t.

 in reduce

 static prevF = 0;

 reduce(t, square_w_t)
 {
  f = square_w_t * A  + B * prevF ;
  output.collect(t,f)
  prevF = f
 }

 According to me the step of B*F(t-1) is inherently sequential.
 So all we can do is parallelize the a*w(t)*w(t) part.

 -Sagar


 some speed wrote:

 Hello all,

 I am trying to implement a Map Reduce Chain to solve a particular
 statistic
 problem. I have come to a point where I have to solve the following type
 of
 equation in Hadoop:

 F(t)= A*w(t)*w(t) + B*F(t-1); Given: F(0)=0, A and B are Alpha and
 Beta
 and their values are known.

 Now, W is series of numbers (There could be *a million* or more numbers).

 So to Solve the equation in terms of Map Reduce, there are basically 2
 issues which I can think of:

 1) How will I be able to get the value of F(t-1) since it means as each
 step
 i need the value from the previous iteration. And that is not possible
 while
 computing parallely.
 2) the w(t) values have to be read and applied in order also ,and, again
 that is a prb while computing parallely.

 Can some please help me go abt this problem and overcome the issues?

 Thanks,

 Sharath






Design issue for a problem using Map Reduce

2009-02-14 Thread some speed
Hello all,

I am trying to implement a Map Reduce Chain to solve a particular statistic
problem. I have come to a point where I have to solve the following type of
equation in Hadoop:

F(t)= A*w(t)*w(t) + B*F(t-1); Given: F(0)=0, A and B are Alpha and Beta
and their values are known.

Now, W is series of numbers (There could be *a million* or more numbers).

So to Solve the equation in terms of Map Reduce, there are basically 2
issues which I can think of:

1) How will I be able to get the value of F(t-1) since it means as each step
i need the value from the previous iteration. And that is not possible while
computing parallely.
2) the w(t) values have to be read and applied in order also ,and, again
that is a prb while computing parallely.

Can some please help me go abt this problem and overcome the issues?

Thanks,

Sharath


Re: Design issue for a problem using Map Reduce

2009-02-14 Thread Sagar Naik

Here is one thought
N maps and 1 Reduce,
input to map: t,w(t)
output of map t, w(t)*w(t)
I assume t is an integer. So in case of 1 reducer, u will receive
t0, square(w(0)
t1, square(w(1)
t2, square(w(2)
t3, square(w(3)
Note this wiil be a sorted series on t.

in reduce

static prevF = 0;

reduce(t, square_w_t)
{
  f = square_w_t * A  + B * prevF ;
  output.collect(t,f)
  prevF = f
}

According to me the step of B*F(t-1) is inherently sequential.
So all we can do is parallelize the a*w(t)*w(t) part.

-Sagar

some speed wrote:

Hello all,

I am trying to implement a Map Reduce Chain to solve a particular statistic
problem. I have come to a point where I have to solve the following type of
equation in Hadoop:

F(t)= A*w(t)*w(t) + B*F(t-1); Given: F(0)=0, A and B are Alpha and Beta
and their values are known.

Now, W is series of numbers (There could be *a million* or more numbers).

So to Solve the equation in terms of Map Reduce, there are basically 2
issues which I can think of:

1) How will I be able to get the value of F(t-1) since it means as each step
i need the value from the previous iteration. And that is not possible while
computing parallely.
2) the w(t) values have to be read and applied in order also ,and, again
that is a prb while computing parallely.

Can some please help me go abt this problem and overcome the issues?

Thanks,

Sharath