Re: Barrier between reduce and map of the next round

2010-02-08 Thread Felix Halim
Hi, Currently the barrier between r(i) and m(i+1) is the Job barrier. That is, m(i+1) will be blocked until all r(i) finish (until Job i finishes). I'm saying this blocking is not necessary if we can concatenate them all in a single Job as an endless chain. Therefore m(i+1) can start immediately ev

Re: Barrier between reduce and map of the next round

2010-02-08 Thread Amogh Vasekar
Hi, >>m1 | r1 m2 | r2 m3 | ... | r(K-1) mK | rK m(K+1) My understanding is it would be something like: m1|(r1 m2)| m(identity) | r2, if you combine the r(i) and m(i+1), because of the hard distinction between Rs & Ms. Amogh On 2/4/10 1:46 PM, "Felix Halim" wrote: Talking about barrier, curren

Re: avoiding data redistribution in iterative mapreduce

2010-02-08 Thread Amogh Vasekar
Hi, AFAIK no. I'm not sure how much of a task it is to write a HOD-like scheduler, or if it's even feasible given the new architecture of a single managing JT talking directly to the TTs. Probably someone more familiar with the scheduler architecture can help you better. What I was trying to suggest wi

Re: Strange behaviour from a custom Writable

2010-02-08 Thread Amogh Vasekar
Hi, Yes, the same location is populated with different values (returned by iter.next()) for optimization reasons. There is a new patch which will allow you to mark() and reset() the iterator so that you buffer the required values (equivalently you can do that yourself, it's anyway in-mem for the patch

Re: Strange behaviour from a custom Writable

2010-02-08 Thread Ed Mazur
Yeah, my understanding is that the iterator is just giving you a pointer to the same location each time. This seems to match up with the behavior we've both observed, but maybe someone more familiar with the internals can verify. Also, in case you didn't know, you can use what's called a secondary

Re: Strange behaviour from a custom Writable

2010-02-08 Thread James Hammerton
Thanks, Ed. I'm copying the values into a list, then sorting them and emitting the top 20, so yes, they are buffered. I'll try cloning each item tomorrow and see if that works. Does this mean the Iterator is returning the same pointer with each call to next() but with different contents bein

Re: Strange behaviour from a custom Writable

2010-02-08 Thread Ed Mazur
Hi James, I ran into something similar in the past and suspect the problem may be in your reduce function. Are you buffering values from the iterator? If you are, then you need to first clone the value when taking it from the iterator (implement Cloneable in your custom Writable). Otherwise they w
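The buffering pitfall described here can be reproduced without Hadoop. The sketch below is only an illustration: ReuseDemo, reusingValues, and the nested LongDoublePair stand-in are hypothetical names, and the iterator merely imitates the object reuse discussed in this thread by overwriting one shared instance on each next(). It uses a plain copy() method instead of Cloneable; either way, each buffered element gets its own object.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class ReuseDemo {
    /** Hypothetical stand-in for the LongDoublePair value type from the thread. */
    public static final class LongDoublePair {
        long id;
        double score;

        LongDoublePair(long id, double score) {
            this.id = id;
            this.score = score;
        }

        /** Returns an independent object holding the current field values. */
        LongDoublePair copy() {
            return new LongDoublePair(id, score);
        }

        @Override
        public String toString() {
            return id + ":" + score;
        }
    }

    /** Imitates the reducer's values: one shared instance, overwritten on every next(). */
    public static Iterable<LongDoublePair> reusingValues(long[] ids, double[] scores) {
        final LongDoublePair shared = new LongDoublePair(0L, 0.0);
        return () -> new Iterator<LongDoublePair>() {
            private int i = 0;

            @Override
            public boolean hasNext() {
                return i < ids.length;
            }

            @Override
            public LongDoublePair next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                shared.id = ids[i];
                shared.score = scores[i];
                i++;
                return shared; // same reference every time, new contents
            }
        };
    }

    /** Buggy buffering: stores the shared reference, so every element ends up identical. */
    public static List<LongDoublePair> bufferByReference(Iterable<LongDoublePair> values) {
        List<LongDoublePair> buffered = new ArrayList<>();
        for (LongDoublePair v : values) {
            buffered.add(v);
        }
        return buffered;
    }

    /** Safe buffering: copies each value before the iterator overwrites it. */
    public static List<LongDoublePair> bufferByCopy(Iterable<LongDoublePair> values) {
        List<LongDoublePair> buffered = new ArrayList<>();
        for (LongDoublePair v : values) {
            buffered.add(v.copy());
        }
        return buffered;
    }

    public static void main(String[] args) {
        long[] ids = {1L, 2L, 3L};
        double[] scores = {0.5, 1.5, 2.5};
        System.out.println(bufferByReference(reusingValues(ids, scores))); // [3:2.5, 3:2.5, 3:2.5]
        System.out.println(bufferByCopy(reusingValues(ids, scores)));      // [1:0.5, 2:1.5, 3:2.5]
    }
}
```

The list built by reference has the right number of elements but they all show the last value, which matches the symptom reported in this thread.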

Strange behaviour from a custom Writable

2010-02-08 Thread James Hammerton
Hi, For a particular project I created a writable for holding a long and a double called LongDoublePair. My mapper outputs LongDoublePair values and the reducer receives an Iterable. The problem is that when I try to use it, whilst I get the right number of elements in the Iterable, they are all
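For readers unfamiliar with the kind of type being described, here is a rough sketch of a value class holding a long and a double. It is not the poster's code: the field names, getters, and roundTrip helper are assumptions made for illustration. The write and readFields methods follow the signatures of Hadoop's Writable interface, but the class deliberately does not declare "implements Writable" so that it compiles with only the JDK; in a real job it would.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class LongDoublePair {
    private long first;     // assumed field name
    private double second;  // assumed field name

    /** No-arg constructor, so the framework can create an empty instance and fill it. */
    public LongDoublePair() {
    }

    public LongDoublePair(long first, double second) {
        this.first = first;
        this.second = second;
    }

    public long getFirst() {
        return first;
    }

    public double getSecond() {
        return second;
    }

    /** Serializes both fields in a fixed order. */
    public void write(DataOutput out) throws IOException {
        out.writeLong(first);
        out.writeDouble(second);
    }

    /** Overwrites this instance's fields in place, in the same order as write(). */
    public void readFields(DataInput in) throws IOException {
        first = in.readLong();
        second = in.readDouble();
    }

    /** Hypothetical helper: serializes a pair to bytes and reads it into a fresh instance. */
    public static LongDoublePair roundTrip(LongDoublePair original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            original.write(new DataOutputStream(bytes));
            LongDoublePair restored = new LongDoublePair();
            restored.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return restored;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because readFields() overwrites an existing instance instead of returning a new one, a framework can deserialize many records into a single object.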