Hi,

"[..]if more than 98% of the total time is spent in garbage collection
and less than 2% of the heap is recovered, an OutOfMemoryError will be
thrown. This feature is designed to prevent applications from running
for an extended period of time while making little or no progress
because the heap is too small. If necessary, this feature can be
disabled by adding the option -XX:-UseGCOverheadLimit to the command
line."

This is what often happens in MapReduce operations when you process a lot of data.
I recommend trying:
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx1024m -XX:-UseGCOverheadLimit</value>
</property>
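
Before raising -Xmx, it may be worth checking that the larger heaps still fit on the node. A rough sketch of the arithmetic, using the slot counts and node size from the original message (6 mappers, 1 reducer, 7 GB RAM); the class and method names are made up for illustration, and real JVMs need more memory than their -Xmx value:

```java
public class HeapBudget {
    // Worst-case heap demand in MB if every task slot on a node is busy at once.
    static long worstCaseHeapMb(int mapSlots, int reduceSlots, long xmxMb) {
        return (long) (mapSlots + reduceSlots) * xmxMb;
    }

    public static void main(String[] args) {
        long nodeRamMb = 7L * 1024; // assumes "7 gb" means 7 * 1024 MB
        System.out.println("-Xmx400m : " + worstCaseHeapMb(6, 1, 400) + " MB of " + nodeRamMb + " MB");
        System.out.println("-Xmx1024m: " + worstCaseHeapMb(6, 1, 1024) + " MB of " + nodeRamMb + " MB");
    }
}
```

With 7 slots at 1024 MB each, the task heaps alone add up to the whole 7 GB, which leaves nothing for the OS and the Hadoop daemons, so a smaller heap or fewer slots per node is probably needed alongside this setting.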


Also, in my experience, when processing a lot of data it is often
much cheaper to kill the JVM than to wait for GC.
For that reason, if you have a lot of BIG tasks rather than tons of
small ones, do not reuse the JVM: killing it and starting a new one is
often much cheaper than trying to GC 1 GB of RAM (I don't know why, it
just turned out that way in my tests).
<property>
  <name>mapred.job.reuse.jvm.num.tasks</name>
  <value>1</value>
</property>
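
If you want to check the GC cost on your own job rather than take my word for it, the JVM exposes its accumulated collection time through the standard java.lang.management beans. A minimal sketch (the class and method names are made up; it reports the GC share since JVM start, which is only a coarse indicator and not the same window the overhead limit uses):

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcShare {
    // Fraction of elapsed time spent in GC; 0.0 when no time has elapsed.
    static double fraction(long gcMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            return 0.0;
        }
        return (double) gcMillis / uptimeMillis;
    }

    // Sum the accumulated collection time over all collectors in this JVM.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 if the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        double share = fraction(totalGcMillis(), uptime);
        System.out.printf("GC share since JVM start: %.1f%%%n", 100.0 * share);
    }
}
```

Logging this at the end of a task, once with JVM reuse and once without, should show whether reused JVMs really spend a growing share of their time in GC.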

Regards,
Vitaliy S

On Sun, Sep 26, 2010 at 11:55 AM, Bradford Stephens
<bradfordsteph...@gmail.com> wrote:
> Greetings,
>
> I'm running into a brain-numbing problem on Elastic MapReduce. I'm
> running a decent-size task (22,000 mappers, a ton of GZipped input
> blocks, ~1TB of data) on 40 c1.xlarge nodes (7 gb RAM, ~8 "cores").
>
> I get failures randomly --- sometimes at the end of my 6-step process,
> sometimes at the first reducer phase, sometimes in the mapper. It
> seems to fail in multiple areas. Mostly in the reducers. Any ideas?
>
> Here's the settings I've changed:
> -Xmx400m
> 6 max mappers
> 1 max reducer
> 1GB swap partition
> mapred.job.reuse.jvm.num.tasks=50
> mapred.reduce.parallel.copies=3
>
>
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>        at java.nio.CharBuffer.wrap(CharBuffer.java:350)
>        at java.nio.CharBuffer.wrap(CharBuffer.java:373)
>        at java.lang.StringCoding$StringDecoder.decode(StringCoding.java:138)
>        at java.lang.StringCoding.decode(StringCoding.java:173)
>        at java.lang.String.<init>(String.java:443)
>        at java.lang.String.<init>(String.java:515)
>        at org.apache.hadoop.io.WritableUtils.readString(WritableUtils.java:116)
>        at cascading.tuple.TupleInputStream.readString(TupleInputStream.java:144)
>        at cascading.tuple.TupleInputStream.readType(TupleInputStream.java:154)
>        at cascading.tuple.TupleInputStream.getNextElement(TupleInputStream.java:101)
>        at cascading.tuple.hadoop.TupleElementComparator.compare(TupleElementComparator.java:75)
>        at cascading.tuple.hadoop.TupleElementComparator.compare(TupleElementComparator.java:33)
>        at cascading.tuple.hadoop.DelegatingTupleElementComparator.compare(DelegatingTupleElementComparator.java:74)
>        at cascading.tuple.hadoop.DelegatingTupleElementComparator.compare(DelegatingTupleElementComparator.java:34)
>        at cascading.tuple.hadoop.DeserializerComparator.compareTuples(DeserializerComparator.java:142)
>        at cascading.tuple.hadoop.GroupingSortingComparator.compare(GroupingSortingComparator.java:55)
>        at org.apache.hadoop.mapred.Merger$MergeQueue.lessThan(Merger.java:373)
>        at org.apache.hadoop.util.PriorityQueue.downHeap(PriorityQueue.java:136)
>        at org.apache.hadoop.util.PriorityQueue.adjustTop(PriorityQueue.java:103)
>        at org.apache.hadoop.mapred.Merger$MergeQueue.adjustPriorityQueue(Merger.java:335)
>        at org.apache.hadoop.mapred.Merger$MergeQueue.next(Merger.java:350)
>        at org.apache.hadoop.mapred.Merger.writeFile(Merger.java:156)
>        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.doInMemMerge(ReduceTask.java:2645)
>        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.run(ReduceTask.java:2586)
>
> --
> Bradford Stephens,
> Founder, Drawn to Scale
> drawntoscalehq.com
> 727.697.7528
>
> http://www.drawntoscalehq.com --  The intuitive, cloud-scale data
> solution. Process, store, query, search, and serve all your data.
>
> http://www.roadtofailure.com -- The Fringes of Scalability, Social
> Media, and Computer Science
>
