Re: Job tracker error

2011-07-25 Thread Gagan Bansal
Thanks everyone. After setting HADOOP_CLIENT_OPTS, the error changed: now it said the number of tasks my job was launching was more than 100,000, which I believe is the maximum configured on my cluster. This was because I had more than 100,000 input files. I merged some files so that the total number of tasks stayed under the limit.
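
For the archives: one simple way to merge many small plain-text HDFS files into fewer, larger ones is to concatenate them through the shell. This is only a sketch of the general approach, not necessarily what Gagan did; the paths are placeholders, and it assumes your hadoop fs -put accepts "-" to read from stdin and that the inputs are uncompressed text.

    # Create a directory for the merged files (placeholder paths).
    hadoop fs -mkdir /data/input-merged

    # Concatenate all part files of one input subdirectory into a single HDFS file.
    # Repeat (or loop) per subdirectory to bring the file count below the task limit.
    hadoop fs -cat /data/input/dir0001/part-* | hadoop fs -put - /data/input-merged/dir0001.txt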

Re: Job tracker error

2011-07-24 Thread Arun C Murthy
On Jul 24, 2011, at 2:34 AM, Joey Echeverria wrote:
> You're running out of memory trying to generate the splits. You need to set a
> bigger heap for your driver program. Assuming you're using the hadoop jar
> command to launch your job, you can do this by setting HADOOP_HEAPSIZE to a
> larger value in $HADOOP_HOME/conf/hadoop-env.sh

Re: Job tracker error

2011-07-24 Thread Joey Echeverria
You're running out of memory trying to generate the splits. You need to set a bigger heap for your driver program. Assuming you're using the hadoop jar command to launch your job, you can do this by setting HADOOP_HEAPSIZE to a larger value in $HADOOP_HOME/conf/hadoop-env.sh.

-Joey
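
A minimal sketch of that change in $HADOOP_HOME/conf/hadoop-env.sh; the 2000 MB figure is only an illustrative value, not a recommendation:

    # hadoop-env.sh
    # Maximum heap size, in MB, for JVMs launched by the hadoop scripts,
    # including the client process that computes the input splits.
    export HADOOP_HEAPSIZE=2000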

Re: Job tracker error

2011-07-24 Thread Harsh J
Try with a higher heap size. The issue may be that too many splits are being generated on the client side, filling up the heap (IIRC the default heap is used for the RunJar process, unless you pass HADOOP_CLIENT_OPTS=-Xmx512m or so to raise it).
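
A sketch of what passing HADOOP_CLIENT_OPTS looks like when launching the job; the jar name, driver class, paths, and the 2 GB heap are placeholders:

    # Raise the heap of the client-side RunJar JVM only (not the task JVMs).
    export HADOOP_CLIENT_OPTS="-Xmx2g"
    hadoop jar my-job.jar com.example.MyDriver /data/input /data/output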

Job tracker error

2011-07-24 Thread Gagan Bansal
Hi All, I am getting the following error when running a job on about 12 TB of data. It happens before any mappers or reducers are launched, and the job starts fine if I reduce the amount of input data. Any ideas as to what may be causing this error?

Exception in thread "main" java.lang.OutOfMemoryError
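
A quick way to see how many files (and hence, roughly, how many splits) the client has to process is to count the files under the job's input path; /data/input below is a placeholder:

    # Prints directory count, file count, and total bytes under the input path.
    hadoop fs -count /data/input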