Thanks Joey.
I guess the puzzle then is why some of my mappers used up over 312MB,
leaving insufficient room out of the 512MB total we allocate, when the job is
no more complex than other jobs that run happily in that space. The memory
usage is independent of the size of my data set, and even the l
It was initialising a 200MB buffer to sort the output in.
How much space did you allocate the task JVMs (mapred.child.java.opts
in mapred-site.xml)?
If you didn't change the default, it's set to 200MB, which is why you
would run out of memory trying to allocate a 200MB buffer.
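For reference, a sketch of what raising the task-JVM heap looks like in mapred-site.xml; the 512m value is illustrative, not a recommendation, and should be sized against your cluster's per-slot memory:

```xml
<!-- mapred-site.xml: raise the per-task JVM heap
     (the Hadoop 1.x default is -Xmx200m, too small to also
     hold a 200MB io.sort.mb buffer) -->
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx512m</value>
</property>
```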
-Joey
Hi,
I lowered io.sort.mb to 100MB from 200MB and that allowed my job to get
through the mapping phase; thanks Chris.
However, what I don't understand is why the memory got used up in the first
place, when the mapper only buffers the previous input and the maximum
serialised size of the objects
Lower io.sort.mb or raise the heap size for the task. -C
On Tue, Apr 26, 2011 at 10:55 AM, James Hammerton
wrote:
> Hi,
>
> I have a job that runs fine with a small data set in pseudo-distributed mode
> on my desktop workstation but when I run it on our Hadoop cluster it falls
> over, crashing du
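Chris's one-liner above maps onto two alternative mapred-site.xml settings; a minimal sketch, with illustrative values (the heap must fit the sort buffer plus the map task's own working memory):

```xml
<!-- Option A: shrink the map-side sort buffer so it fits
     inside the default 200MB task heap -->
<property>
  <name>io.sort.mb</name>
  <value>100</value>
</property>

<!-- Option B: keep io.sort.mb at 200 and give the task JVM
     a larger heap instead (value illustrative) -->
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx512m</value>
</property>
```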
Hi,
I have a job that runs fine with a small data set in pseudo-distributed mode
on my desktop workstation but when I run it on our Hadoop cluster it falls
over, crashing during the initialisation of some of the mappers. The errors
look like this:
2011-04-26 14:34:04,494 FATAL org.apache.hadoop.m