FWIW, I usually specify the number of reducers explicitly, both in streaming
and through the Java API. The "default" is whatever gets read from your config
files on the submitting node.
Nick Jones
On Oct 21, 2011, at 5:00 PM, Mapred Learn wrote:
> Hi,
> Does streaming jar create 1 reducer
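(A minimal sketch of the Java side of this, against the 0.20-era API; the
class name and the count of 4 are placeholders. With streaming the equivalent
is passing -D mapred.reduce.tasks=4 on the command line.)

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    public class SubmitWithReducers {
      public static void main(String[] args) throws Exception {
        // Configuration picks up the *-site.xml files on the submitting node.
        Configuration conf = new Configuration();
        Job job = new Job(conf, "example");
        // Explicitly override whatever mapred.reduce.tasks the config files set:
        job.setNumReduceTasks(4);
        // ... set mapper, reducer, and input/output paths as usual, then:
        // job.waitForCompletion(true);
      }
    }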
I found Cloudera's distribution easy to use, but it's the only one I've tried.
Nick
On Tue, Feb 22, 2011 at 9:42 PM, real great.. wrote:
> Hi,
> Very trivial question.
> Which is the easiest way to install Hadoop?
> I mean, which distribution should I go for? Apache or Cloudera?
> And which is th
The number of reducers is normally the number of output files desired
as well. Forcing a large job to write its output to one or a few files
can make the job take a very long time.
Nick Jones
Sent by radiation.
On Aug 27, 2010, at 11:52 PM, Xin Feng wrote:
> Did you mean that I should include:
>
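(To make the reducer/output-file relationship concrete, a sketch under the
same 0.20-era API assumptions; the count of 3 is illustrative.)

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    public class OutputFileCount {
      public static void main(String[] args) throws Exception {
        Job job = new Job(new Configuration(), "example");
        // Each reducer writes exactly one part file with the default output
        // format, so this produces part-r-00000 through part-r-00002:
        job.setNumReduceTasks(3);
        // job.setNumReduceTasks(1) would funnel every map output through a
        // single reducer and emit a lone part-r-00000; slow for a big job.
      }
    }

If a single file is truly required, running hadoop fs -getmerge on the output
directory after the job finishes is usually far cheaper than forcing one
reducer.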
Hi,
It's true that Linux is a better-supported platform, but Windows with
Cygwin does work.
Nick Jones
Sent by radiation.
On Jul 21, 2010, at 5:28 PM, Khaled BEN BAHRI wrote:
Hi :)
Windows is not well tested yet as a production platform; GNU/Linux is better
than Windows for using Hadoop
Hi Giridhar,
Can you share your code somewhere?
Nick Jones
Sent by radiation.
On Jun 8, 2010, at 7:21 AM, "Giridhar Addepalli" wrote:
Hi Sonal,
I am using Hadoop 0.20.2. Is this okay?
Thanks for the suggestion; I will look at the hiho framework.
Thanks,
Giridhar.
-Ori
See this:
http://hadoop.apache.org/common/docs/current/mapred-default.html
HTH,
DR
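(If you want to see which value actually wins on your setup once
mapred-default.xml and your site files are merged, a small sketch; the
property name here is just an example.)

    import org.apache.hadoop.mapred.JobConf;

    public class ShowEffectiveValue {
      public static void main(String[] args) {
        // JobConf loads mapred-default.xml plus any mapred-site.xml overrides.
        JobConf conf = new JobConf();
        System.out.println(conf.get("mapred.reduce.tasks"));
      }
    }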
Couldn't the DistributedCache idea still work with a chained set of
jobs? Map the first set into files on the DFS and add them to the DC
for the next time through?
Nick Jones
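(A rough sketch of that chaining idea, assuming the 0.20 API; all paths and
job names are made up.)

    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.filecache.DistributedCache;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class ChainedJobs {
      public static void main(String[] args) throws Exception {
        // First pass: write the intermediate data to a known DFS location.
        Job first = new Job(new Configuration(), "build-lookup");
        FileInputFormat.addInputPath(first, new Path("/user/demo/raw"));
        FileOutputFormat.setOutputPath(first, new Path("/user/demo/lookup"));
        // ... set mapper/reducer for the first job, then:
        if (!first.waitForCompletion(true)) System.exit(1);

        // Second pass: ship the first job's output to every task via the cache.
        Configuration conf2 = new Configuration();
        DistributedCache.addCacheFile(new URI("/user/demo/lookup/part-r-00000"),
            conf2);
        Job second = new Job(conf2, "join-against-lookup");
        // ... configure and run the second job the same way.
      }
    }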
I think the biggest issue would be upstream bandwidth and latency. If the
thought was to use a SETI-style approach, most users wouldn't have the
necessary upstream bandwidth to support the DFS. It is likely that a
few local desktop machines would significantly outpace a much larger
DSL/cable-connected cluster.