A couple of things you can try:

1. Increase the heap size for the tasks.

2. Since your OOMs happen randomly, try setting
-XX:+HeapDumpOnOutOfMemoryError in your child JVM parameters. At the least,
analyzing the heap dump will show why the heap is growing -- is it a leak,
or do your mappers or reducers simply need a larger heap?

3. Another possible cause is poor JVM GC tuning. Sometimes the defaults
can't keep up with the rate at which garbage is created; that calls for
some GC tuning.
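As a sketch, on a 0.20-era Hadoop cluster (which EMR ran at the time) all three suggestions land in the standard mapred.child.java.opts property; the heap size, dump path, and GC flags below are illustrative values, not recommendations:

```xml
<!-- mapred-site.xml (or pass via -D mapred.child.java.opts=... on the job);
     values here are examples only -->
<property>
  <name>mapred.child.java.opts</name>
  <!-- bigger heap (1), dump on OOM (2), GC logging to guide tuning (3) -->
  <value>-Xmx600m
         -XX:+HeapDumpOnOutOfMemoryError
         -XX:HeapDumpPath=/mnt/dumps
         -verbose:gc -XX:+PrintGCDetails</value>
</property>
```

The resulting .hprof dump can then be opened with jhat or the Eclipse Memory Analyzer to see which objects dominate the heap.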

-Bharath




From: common-user@hadoop.apache.org
To: cascading-u...@googlegroups.com; core-u...@hadoop.apache.org
Cc: 
Sent: Sunday, September 26, 2010 12:55:15 AM
Subject: java.lang.OutOfMemoryError: GC overhead limit exceeded

Greetings,

I'm running into a brain-numbing problem on Elastic MapReduce. I'm
running a decent-size task (22,000 mappers, a ton of GZipped input
blocks, ~1TB of data) on 40 c1.xlarge nodes (7 gb RAM, ~8 "cores").

I get failures randomly --- sometimes at the end of my 6-step process,
sometimes at the first reducer phase, sometimes in the mapper. It
seems to fail in multiple areas. Mostly in the reducers. Any ideas?

Here are the settings I've changed:
-Xmx400m
6 max mappers
1 max reducer
1GB swap partition
mapred.job.reuse.jvm.num.tasks=50
mapred.reduce.parallel.copies=3


java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.nio.CharBuffer.wrap(CharBuffer.java:350)
    at java.nio.CharBuffer.wrap(CharBuffer.java:373)
    at java.lang.StringCoding$StringDecoder.decode(StringCoding.java:138)
    at java.lang.StringCoding.decode(StringCoding.java:173)
    at java.lang.String.&lt;init&gt;(String.java:443)
    at java.lang.String.&lt;init&gt;(String.java:515)
    at org.apache.hadoop.io.WritableUtils.readString(WritableUtils.java:116)
    at cascading.tuple.TupleInputStream.readString(TupleInputStream.java:144)
    at cascading.tuple.TupleInputStream.readType(TupleInputStream.java:154)
    at cascading.tuple.TupleInputStream.getNextElement(TupleInputStream.java:101)
    at cascading.tuple.hadoop.TupleElementComparator.compare(TupleElementComparator.java:75)
    at cascading.tuple.hadoop.TupleElementComparator.compare(TupleElementComparator.java:33)
    at cascading.tuple.hadoop.DelegatingTupleElementComparator.compare(DelegatingTupleElementComparator.java:74)
    at cascading.tuple.hadoop.DelegatingTupleElementComparator.compare(DelegatingTupleElementComparator.java:34)
    at cascading.tuple.hadoop.DeserializerComparator.compareTuples(DeserializerComparator.java:142)
    at cascading.tuple.hadoop.GroupingSortingComparator.compare(GroupingSortingComparator.java:55)
    at org.apache.hadoop.mapred.Merger$MergeQueue.lessThan(Merger.java:373)
    at org.apache.hadoop.util.PriorityQueue.downHeap(PriorityQueue.java:136)
    at org.apache.hadoop.util.PriorityQueue.adjustTop(PriorityQueue.java:103)
    at org.apache.hadoop.mapred.Merger$MergeQueue.adjustPriorityQueue(Merger.java:335)
    at org.apache.hadoop.mapred.Merger$MergeQueue.next(Merger.java:350)
    at org.apache.hadoop.mapred.Merger.writeFile(Merger.java:156)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.doInMemMerge(ReduceTask.java:2645)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.run(ReduceTask.java:2586)

-- 
Bradford Stephens,
Founder, Drawn to Scale
drawntoscalehq.com
727.697.7528

http://www.drawntoscalehq.com --  The intuitive, cloud-scale data
solution. Process, store, query, search, and serve all your data.

http://www.roadtofailure.com -- The Fringes of Scalability, Social
Media, and Computer Science


      
