Try with a higher heap size. The problem is likely that too many
splits are being generated on the client side, filling up the heap
(IIRC the default heap size is used for RunJar operations unless you
pass HADOOP_CLIENT_OPTS=-Xmx512m or so to raise it).
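
For example, something along these lines before submitting the
streaming job (the -Xmx2g value, jar location and paths below are just
placeholders; size the heap to your input):

    # Raise the heap of the client-side JVM (RunJar/JobClient), which
    # computes all the input splits before the job is submitted.
    export HADOOP_CLIENT_OPTS="-Xmx2g"

    # Hypothetical streaming invocation; substitute your own jar
    # location, input/output paths and mapper/reducer scripts.
    hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-streaming-*.jar \
        -input /path/to/input \
        -output /path/to/output \
        -mapper your_mapper.py \
        -reducer your_reducer.py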

On Sun, Jul 24, 2011 at 2:36 PM, Gagan Bansal <gagan.ban...@gmail.com> wrote:
> Hi All,
> I am getting the following error when running a job on about 12 TB of
> data. It happens before any mappers or reducers are launched, and the
> job starts fine if I reduce the amount of input data. Any ideas as to
> what may be the reason for this error?
> Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
>     at java.util.Arrays.copyOf(Arrays.java:2786)
>     at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:71)
>     at java.io.DataOutputStream.writeByte(DataOutputStream.java:136)
>     at org.apache.hadoop.io.UTF8.writeChars(UTF8.java:278)
>     at org.apache.hadoop.io.UTF8.writeString(UTF8.java:250)
>     at org.apache.hadoop.io.ObjectWritable.writeObject(ObjectWritable.java:131)
>     at org.apache.hadoop.ipc.RPC$Invocation.write(RPC.java:111)
>     at org.apache.hadoop.ipc.Client$Connection.sendParam(Client.java:741)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1011)
>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:224)
>     at $Proxy6.getBlockLocations(Unknown Source)
>     at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
>     at $Proxy6.getBlockLocations(Unknown Source)
>     at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:359)
>     at org.apache.hadoop.hdfs.DFSClient.getBlockLocations(DFSClient.java:380)
>     at org.apache.hadoop.hdfs.DistributedFileSystem.getFileBlockLocations(DistributedFileSystem.java:178)
>     at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:234)
>     at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:946)
>     at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:938)
>     at org.apache.hadoop.mapred.JobClient.access$500(JobClient.java:170)
>     at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:854)
>     at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:807)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
>     at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:807)
>     at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:781)
>     at org.apache.hadoop.streaming.StreamJob.submitAndMonitorJob(StreamJob.java:876)
> Gagan Bansal
>



-- 
Harsh J
