So why is it called Hadoop Streaming, if it doesn't behave like a
streaming application (the reducers don't receive data as it is
produced by the map tasks)?
On 16 January 2013 05:41, Jeff Bean wrote:
> me property. The reduce method is not called until the mappers are done, and
> the redu
It's called Hadoop Streaming because keys and values are streamed into the
stdin of the script you specify for Hadoop Streaming, and the script's
output is then captured via stdout.
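
To make that concrete, here is a rough sketch of what a streaming mapper
script might look like. It assumes the default tab-separated key/value
format and a word-count style job; the function name run_mapper and the
sample input are made up for illustration. It uses in-memory buffers in
place of the real stdin/stdout so it runs without a cluster.

```python
import io
import sys


def run_mapper(lines, out):
    """Write one tab-separated (word, 1) record per word to `out`."""
    for line in lines:
        for word in line.split():
            out.write(f"{word}\t1\n")


# Stand-ins for stdin/stdout so the sketch runs on its own.
demo_out = io.StringIO()
run_mapper(io.StringIO("to be or not to be\n"), demo_out)
print(demo_out.getvalue(), end="")

# In an actual streaming script the last step would instead be:
# run_mapper(sys.stdin, sys.stdout)
```

The "streaming" here refers only to this stdin/stdout pipe between Hadoop
and the script, not to when the reducers start receiving data.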
On Wed, Jan 16, 2013 at 1:04 AM, Pedro Sá da Costa wrote:
> So why it's called hadoop streaming, if it doesn't behave like a
> streaming appli
Hi,
Thanks for the response. There were some issues with my code, which I have
checked in detail.
All the values from the map are present in the reducer, but not in sorted
order. This happens when the number of values for a key is very large.
Thanks
Utkarsh
From: Vinod Kumar Vavilapalli [mailto:vino...@h
We don't sort values (only keys), nor do we apply any manual limits in MR.
Can you post a reproducible test case to support your suspicion?
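
If you need the values in order, the reducer has to sort them itself. A
minimal sketch of that, assuming a streaming-style reducer with
tab-separated records, integer values, and value lists small enough to
hold in memory (run_reducer, parse and the sample data are hypothetical):

```python
import io
from itertools import groupby


def parse(lines):
    """Split each 'key<TAB>value' line into a (key, value) pair."""
    for line in lines:
        key, _, value = line.rstrip("\n").partition("\t")
        yield key, value


def run_reducer(lines, out):
    # groupby relies on records with the same key being adjacent,
    # which is what sorting by key guarantees.
    for key, group in groupby(parse(lines), key=lambda kv: kv[0]):
        # Values arrive in no particular order, so sort them here.
        values = sorted(int(v) for _, v in group)
        out.write(f"{key}\t{','.join(map(str, values))}\n")


# Keys are grouped, but the values within each key are unordered.
demo_in = io.StringIO("a\t3\na\t1\na\t2\nb\t9\nb\t4\n")
demo_out = io.StringIO()
run_reducer(demo_in, demo_out)
print(demo_out.getvalue(), end="")
```

For keys with too many values to sort in memory, the usual alternative is
a secondary sort, where the value is folded into a composite key so that
the framework's key sorting does the ordering.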
On Jan 16, 2013 4:34 PM, "Utkarsh Gupta" wrote:
> Hi,
>
> Thanks for the response. There was some issues with my code. I have
> checked that in detail.
The patch has not been contributed yet. Upstream at open-mpi there does
seem to be a branch that makes some reference to Hadoop, but I think the
features are yet to be made available there too. Apparently waiting on some
form of a product release first? That's all I could gather from some
sleuthing