Hi Pedro,

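Yes, Hadoop Streaming has the same property. Reducers are not scheduled
before the fraction of completed maps set by
mapred.reduce.slowstart.completed.maps is reached (the default is 0.05,
i.e. 5% of the maps, not 90%). Once scheduled, they only copy map output;
the reduce method itself is not called until all the mappers are done.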
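
To make this concrete, here is a minimal sketch of a streaming reducer in
Python. It is only an illustration: the names parse and reduce_stream and
the word-count style "key<TAB>integer" input are my assumptions, not
anything taken from your job. As far as I know, the script receives nothing
on stdin until the map phase is over and its sorted input has been merged,
which is why it can treat each run of identical keys as complete.

```python
#!/usr/bin/env python
import sys
from itertools import groupby


def parse(lines):
    """Turn 'key<TAB>value' lines into (key, int(value)) pairs.

    The tab-separated integer format is an assumption for this sketch.
    """
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        key, _, value = line.rstrip("\n").partition("\t")
        yield key, int(value)


def reduce_stream(lines):
    """Sum the values of each run of identical keys.

    This relies on the input being sorted by key, which the framework
    does before handing data to a streaming reducer.
    """
    for key, group in groupby(parse(lines), key=lambda kv: kv[0]):
        yield key, sum(value for _, value in group)


if __name__ == "__main__":
    # Hadoop Streaming pipes the reducer's input to stdin and collects
    # whatever the script writes to stdout.
    for key, total in reduce_stream(sys.stdin):
        print("%s\t%d" % (key, total))
```

If you want to experiment with when the reducers get scheduled, the
property can be passed to the streaming job with the generic -D option,
for example -D mapred.reduce.slowstart.completed.maps=1.0 to hold them
back until every map has finished.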

On Tue, Jan 15, 2013 at 3:06 PM, Pedro Sá da Costa <psdc1...@gmail.com> wrote:

> Hi,
>
> I read in the documentation that in MapReduce the reduce tasks only
> start after a percentage (by default 90%) of the maps have finished.
> This means that the slowest maps can delay the start of the reduce
> tasks, and that the input consumed by the reduce tasks arrives as a
> batch. In other words, the scenario in which reduce tasks consume data
> as the map tasks produce it doesn't exist. Does this also hold for
> Hadoop MapReduce streaming?
>
> --
> Best regards,
> P
>
