Hi All,

I am getting the following error when running a job on about 12 TB of data.
It happens before any mappers or reducers are launched, and the job starts
fine if I reduce the amount of input data. Any ideas as to what may be
causing this error?

Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.Arrays.copyOf(Arrays.java:2786)
at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:71)
at java.io.DataOutputStream.writeByte(DataOutputStream.java:136)
at org.apache.hadoop.io.UTF8.writeChars(UTF8.java:278)
at org.apache.hadoop.io.UTF8.writeString(UTF8.java:250)
at org.apache.hadoop.io.ObjectWritable.writeObject(ObjectWritable.java:131)
at org.apache.hadoop.ipc.RPC$Invocation.write(RPC.java:111)
at org.apache.hadoop.ipc.Client$Connection.sendParam(Client.java:741)
at org.apache.hadoop.ipc.Client.call(Client.java:1011)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:224)
at $Proxy6.getBlockLocations(Unknown Source)
at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
at $Proxy6.getBlockLocations(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:359)
at org.apache.hadoop.hdfs.DFSClient.getBlockLocations(DFSClient.java:380)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileBlockLocations(DistributedFileSystem.java:178)
at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:234)
at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:946)
at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:938)
at org.apache.hadoop.mapred.JobClient.access$500(JobClient.java:170)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:854)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:807)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:807)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:781)
at org.apache.hadoop.streaming.StreamJob.submitAndMonitorJob(StreamJob.java:876)

Gagan Bansal
