RE: Hadoop streaming: How is data distributed from mappers to reducers?

2009-08-24 Thread Amogh Vasekar
Does that mean that, if the same key is emitted more than once from a mapper, it is not necessary that the key-value pairs (for that same key) will go to the same reducer?
-Nipun
On Tue, Aug 25, 2009 at 6:13 AM, Aaron Kimball wrote

Re: Hadoop streaming: How is data distributed from mappers to reducers?

2009-08-24 Thread Nipun Saggar
Does that mean that, if the same key is emitted more than once from a mapper, it is not necessary that the key-value pairs (for that same key) will go to the same reducer?
-Nipun
On Tue, Aug 25, 2009 at 6:13 AM, Aaron Kimball wrote:
> Yes. It works just like Java-based MapReduce in that regard.
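
For context on the question above: with Hadoop's default hash partitioning, the reducer for a pair is chosen from a hash of its key modulo the number of reduce tasks, so pairs that share a key end up at the same reducer. The sketch below only illustrates that idea. The names partition and route are hypothetical, and zlib.crc32 is a stand-in hash chosen because it is deterministic, not the hash Hadoop actually uses.

```python
import zlib
from collections import defaultdict


def partition(key: str, num_reducers: int) -> int:
    """Pick a reducer index from the key alone (stand-in hash, not Hadoop's)."""
    h = zlib.crc32(key.encode("utf-8"))
    # Mask to a non-negative value, then take it modulo the reducer count.
    return (h & 0x7FFFFFFF) % num_reducers


def route(pairs, num_reducers):
    """Group (key, value) pairs by the reducer index their key maps to."""
    buckets = defaultdict(list)
    for key, value in pairs:
        buckets[partition(key, num_reducers)].append((key, value))
    return dict(buckets)


# The same key emitted twice is routed to one reducer both times.
pairs = [("apple", "1"), ("pear", "1"), ("apple", "1")]
routed = route(pairs, 4)
```

Because the reducer index depends only on the key, it does not matter which mapper emitted the pair or how many times the key appears.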

Re: Hadoop streaming: How is data distributed from mappers to reducers?

2009-08-24 Thread Aaron Kimball
Yes. It works just like Java-based MapReduce in that regard.
- Aaron
On Sun, Aug 23, 2009 at 5:09 AM, Nipun Saggar wrote:
> Hi all,
>
> I have recently started using Hadoop streaming. From the documentation, I
> understand that by default, each line output from a mapper up to the first
> tab beco

Hadoop streaming: How is data distributed from mappers to reducers?

2009-08-23 Thread Nipun Saggar
Hi all, I have recently started using Hadoop streaming. From the documentation, I understand that by default, each line output from a mapper up to the first tab becomes the key and the rest of the line is the value. I wanted to know whether, between the mapper and reducer, there is a shuffling (sorting) ph
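
The default key/value convention described in this message can be sketched in a few lines of Python. This is a local simulation under stated assumptions, not Hadoop code: split_key_value and shuffle_sort are hypothetical helper names, and the grouping step mimics what a reducer script typically does itself when it reads key-sorted lines from standard input.

```python
from itertools import groupby


def split_key_value(line: str):
    """Split a mapper output line at the first tab: key before, value after."""
    key, _sep, value = line.rstrip("\n").partition("\t")
    # With no tab in the line, the whole line is the key and the value is empty.
    return key, value


def shuffle_sort(lines):
    """Sort pairs by key and collect the values seen for each key."""
    pairs = sorted((split_key_value(l) for l in lines), key=lambda kv: kv[0])
    return [(k, [v for _, v in grp])
            for k, grp in groupby(pairs, key=lambda kv: kv[0])]


mapper_output = ["b\t1\n", "a\t2\n", "b\t3\n"]
grouped = shuffle_sort(mapper_output)
```

Only the first tab matters for the split, so a line such as "k\tv\tw" yields the key "k" and the value "v\tw".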