Thanks everyone.
After setting HADOOP_CLIENT_OPTS, the error changed: the job now failed because it was launching more than 100,000 tasks, which I believe is the maximum configured on my cluster.
This was because I had more than 100,000 input files for the job. I merged some of the files so that the total number of tasks stayed under that limit.
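In case it is useful to anyone else, here is a rough sketch of what I did to combine a directory of small files on HDFS before resubmitting (both paths below are placeholders, not my real ones):

  # Concatenate many small HDFS files into one larger file so the job sees fewer splits.
  # /data/input/small and /data/input/merged are example paths only.
  hadoop fs -cat /data/input/small/part-* | hadoop fs -put - /data/input/merged/part-00000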
On Jul 24, 2011, at 2:34 AM, Joey Echeverria wrote:
> You're running out of memory trying to generate the splits. You need to set a
> bigger heap for your driver program. Assuming you're using the hadoop jar
> command to launch your job, you can do this by setting HADOOP_HEAPSIZE to a
> larger value in $HADOOP_HOME/conf/hadoop-env.sh
You're running out of memory trying to generate the splits. You need to set
a bigger heap for your driver program. Assuming you're using the hadoop jar
command to launch your job, you can do this by setting HADOOP_HEAPSIZE to a
larger value in $HADOOP_HOME/conf/hadoop-env.sh
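For example, something along these lines in hadoop-env.sh (the value is in MB, and 2000 here is just an illustration -- pick whatever your client machine can spare):

  # In $HADOOP_HOME/conf/hadoop-env.sh; HADOOP_HEAPSIZE is in MB
  export HADOOP_HEAPSIZE=2000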
-Joey
On Jul 24, 2011
Try with a higher heap size. Maybe the issue is due to too many splits
being generated at the client side, leading to the heap filling up
(iirc default heap would be used for RunJar ops, unless you pass
HADOOP_CLIENT_OPTS=-Xmx512m or so to raise it).
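Something like this before submitting, for example (the jar and class names are only placeholders for your own job):

  # Raise the heap of the client/driver JVM that computes the splits.
  # myjob.jar and com.example.MyJob are placeholders.
  export HADOOP_CLIENT_OPTS="-Xmx512m"
  hadoop jar myjob.jar com.example.MyJob /path/to/input /path/to/output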
On Sun, Jul 24, 2011 at 2:36 PM, Gagan Bansal wrote:
Hi All,
I am getting the following error on running a job on about 12 TB of data.
This happens before any mappers or reducers are launched.
Also the job starts fine if I reduce the amount of input data. Any ideas as
to what may be the reason for this error?
Exception in thread "main" java.lang.OutOfMemoryError