You can use a randomized reduce key to parallelize the comparison of
different runs.  Each run would be emitted under a key drawn at random
from a small range of integers (say 0..99).  Each reducer would then be in
charge of keeping only the best solution among the runs it receives.  The
final output would be at most 100 values, which could be compared
conventionally.
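
To make that concrete, here is a rough sketch in plain Python that simulates
the idea in a single process.  It is not Hadoop code.  The names (NUM_KEYS,
map_run, reduce_best, best_run) are made up for illustration, and it assumes
each run is a (run_id, score) pair where a higher score is better.

```python
import random
from collections import defaultdict

NUM_KEYS = 100  # assumed size of the reduce-key range (keys 0..99)


def map_run(run_id, score, rng):
    # Emit the run under a random key so runs spread roughly evenly
    # across the reducers.
    return rng.randrange(NUM_KEYS), (score, run_id)


def reduce_best(values):
    # Each reducer keeps only the best run it sees.
    # Assumption: "best" means the highest score.
    return max(values)


def best_run(runs, seed=0):
    rng = random.Random(seed)
    groups = defaultdict(list)  # stands in for the shuffle phase
    for run_id, score in runs:
        key, value = map_run(run_id, score, rng)
        groups[key].append(value)
    # One surviving candidate per key, so at most NUM_KEYS of them.
    partial = [reduce_best(values) for values in groups.values()]
    # Final conventional comparison over the few survivors.
    return max(partial)
```

In a real job the random key would be assigned in the mapper and the
max() would run in each reducer.  The last step only ever sees NUM_KEYS
candidates, no matter how many runs went in.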

Whether this would help really depends on how many runs you have.  If you
have fewer than millions of them, this probably doesn't matter and Miles's
suggestion is fine.

On Thu, Mar 19, 2009 at 11:54 AM, Miles Osborne <mi...@inf.ed.ac.uk> wrote:

> you won't need any reducers.




-- 
Ted Dunning, CTO
DeepDyve
