Re: Do all Mapper outputs with same key go to same Reducer?

2008-09-19 Thread Per Jacobsson
> If that's true, then can I set the number of Reducers very high
> (even equal to the number of maps) to make Job C go faster?

This page has some good info on finding the right number of reducers:
http://wiki.apache.org/hadoop/HowManyMapsAndReduces

/ Per

On Fri, Sep 19, 2008 at 9:42 AM, Miles

Re: Do all Mapper outputs with same key go to same Reducer?

2008-09-19 Thread Miles Osborne
> So here's my question -- does Hadoop guarantee that all records with the same key will end up in the same Reducer task? If that's true,

yes -- think of the record as being sent to the task by hashing over the key

Miles

2008/9/19 Stuart Sierra <[EMAIL PROTECTED]>:
> Hi all,
> The short versio
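
The "hashing over the key" idea in Miles's reply can be sketched as below. This is a minimal illustration, not Hadoop code: the `partition` function and the use of `zlib.crc32` are assumptions chosen so that the result is deterministic across runs. Hadoop's default partitioner works along similar lines, using the key's Java hash code modulo the number of reduce tasks.

```python
import zlib


def partition(key: str, num_reducers: int) -> int:
    """Choose a reducer index from the key alone (hypothetical stand-in)."""
    # crc32 is deterministic across processes, unlike Python's built-in
    # hash() for strings, which is salted per interpreter run.
    return zlib.crc32(key.encode("utf-8")) % num_reducers


# Sample map output, reusing the keys from the original question.
records = [("K1", "A1,1"), ("K2", "A2,1"), ("K1", "A1,2"), ("K2", "A2,2")]
num_reducers = 3

# Group records by the reducer index they would be sent to.
assignments = {}
for key, value in records:
    assignments.setdefault(partition(key, num_reducers), []).append((key, value))

# Record which reducers each key reached; every key should reach exactly one.
reducers_per_key = {}
for reducer, items in assignments.items():
    for key, _ in items:
        reducers_per_key.setdefault(key, set()).add(reducer)
```

Because the reducer index depends only on the key and the reducer count, two records with the same key cannot land on different reducers, although several different keys may share one reducer.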

Do all Mapper outputs with same key go to same Reducer?

2008-09-19 Thread Stuart Sierra
Hi all,

The short version of my question is in the subject. Here's the long version: I have two map/reduce jobs that output records using a common key:

Job A:
K1 => A1,1
K1 => A1,2
K2 => A2,1
K2 => A2,2

Job B:
K1 => B1
K2 => B2
K3 => B3

And a third job that merges records with the
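
To make the setup concrete, here is a rough sketch of what grouping the two outputs by their common key would produce. It assumes the third job merges records that share a key, which the truncated message does not fully state; `merge_by_key` is a hypothetical helper, and this plain Python simulates only the grouping, not an actual Hadoop job.

```python
from collections import defaultdict

# Outputs of Job A and Job B as (key, value) pairs, taken from the message.
job_a = [("K1", "A1,1"), ("K1", "A1,2"), ("K2", "A2,1"), ("K2", "A2,2")]
job_b = [("K1", "B1"), ("K2", "B2"), ("K3", "B3")]


def merge_by_key(*outputs):
    """Collect every value seen for each key across all given outputs."""
    grouped = defaultdict(list)
    for output in outputs:
        for key, value in output:
            grouped[key].append(value)
    return dict(grouped)


merged = merge_by_key(job_a, job_b)
for key in sorted(merged):
    print(key, "=>", merged[key])
```

In a real job, each key's list would be what a single reduce call receives, which is why the guarantee asked about in the subject line matters for the merge.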