fwiw I run m2.xlarge slaves, using the default mappers/reducers (4/2 i think).
with swap:

--bootstrap-action s3://elasticmapreduce/bootstrap-actions/create-swap-file.rb --args "-E,/mnt/swap,1000"

historically i've run with this property with no issues, but i should probably re-research the gc setting (comments please):

"mapred.child.java.opts", "-server -Xmx2000m -XX:+UseParallelOldGC"

i haven't co-installed ganglia to look at utilization lately, but any more than 4 mappers or more than 2 reducers has always given me headaches.

ckw

On Sep 26, 2010, at 12:55 AM, Bradford Stephens wrote:

> Greetings,
>
> I'm running into a brain-numbing problem on Elastic MapReduce. I'm
> running a decent-size task (22,000 mappers, a ton of GZipped input
> blocks, ~1TB of data) on 40 c1.xlarge nodes (7 gb RAM, ~8 "cores").
>
> I get failures randomly --- sometimes at the end of my 6-step process,
> sometimes at the first reducer phase, sometimes in the mapper. It
> seems to fail in multiple areas. Mostly in the reducers. Any ideas?
>
> Here's the settings I've changed:
> -Xmx400m
> 6 max mappers
> 1 max reducer
> 1GB swap partition
> mapred.job.reuse.jvm.num.tasks=50
> mapred.reduce.parallel.copies=3
>
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>   at java.nio.CharBuffer.wrap(CharBuffer.java:350)
>   at java.nio.CharBuffer.wrap(CharBuffer.java:373)
>   at java.lang.StringCoding$StringDecoder.decode(StringCoding.java:138)
>   at java.lang.StringCoding.decode(StringCoding.java:173)
>   at java.lang.String.<init>(String.java:443)
>   at java.lang.String.<init>(String.java:515)
>   at org.apache.hadoop.io.WritableUtils.readString(WritableUtils.java:116)
>   at cascading.tuple.TupleInputStream.readString(TupleInputStream.java:144)
>   at cascading.tuple.TupleInputStream.readType(TupleInputStream.java:154)
>   at cascading.tuple.TupleInputStream.getNextElement(TupleInputStream.java:101)
>   at cascading.tuple.hadoop.TupleElementComparator.compare(TupleElementComparator.java:75)
>   at cascading.tuple.hadoop.TupleElementComparator.compare(TupleElementComparator.java:33)
>   at cascading.tuple.hadoop.DelegatingTupleElementComparator.compare(DelegatingTupleElementComparator.java:74)
>   at cascading.tuple.hadoop.DelegatingTupleElementComparator.compare(DelegatingTupleElementComparator.java:34)
>   at cascading.tuple.hadoop.DeserializerComparator.compareTuples(DeserializerComparator.java:142)
>   at cascading.tuple.hadoop.GroupingSortingComparator.compare(GroupingSortingComparator.java:55)
>   at org.apache.hadoop.mapred.Merger$MergeQueue.lessThan(Merger.java:373)
>   at org.apache.hadoop.util.PriorityQueue.downHeap(PriorityQueue.java:136)
>   at org.apache.hadoop.util.PriorityQueue.adjustTop(PriorityQueue.java:103)
>   at org.apache.hadoop.mapred.Merger$MergeQueue.adjustPriorityQueue(Merger.java:335)
>   at org.apache.hadoop.mapred.Merger$MergeQueue.next(Merger.java:350)
>   at org.apache.hadoop.mapred.Merger.writeFile(Merger.java:156)
>   at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.doInMemMerge(ReduceTask.java:2645)
>   at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.run(ReduceTask.java:2586)
>
> --
> Bradford Stephens,
> Founder, Drawn to Scale
> drawntoscalehq.com
> 727.697.7528
>
> http://www.drawntoscalehq.com -- The intuitive, cloud-scale data
> solution. Process, store, query, search, and serve all your data.
>
> http://www.roadtofailure.com -- The Fringes of Scalability, Social
> Media, and Computer Science

--
Chris K Wensel
ch...@concurrentinc.com
http://www.concurrentinc.com

-- Concurrent, Inc. offers mentoring, support, and licensing for Cascading
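For anyone comparing the two setups in this thread, here is a rough back-of-the-envelope sketch of the worst-case child-JVM heap per node (task slots times -Xmx). The helper name `heap_budget_mb` is made up for illustration, and the 17.1 GB figure for m2.xlarge is an assumption taken from the EC2 instance specs of the time, not from the thread. It ignores the TaskTracker/DataNode daemons and non-heap JVM memory, so treat the ratios as approximate.

```python
def heap_budget_mb(map_slots, reduce_slots, child_heap_mb):
    """Worst-case heap in MB if every task slot on a node runs a full child JVM."""
    return (map_slots + reduce_slots) * child_heap_mb


# Settings quoted in the thread.
bradford_mb = heap_budget_mb(map_slots=6, reduce_slots=1, child_heap_mb=400)
chris_mb = heap_budget_mb(map_slots=4, reduce_slots=2, child_heap_mb=2000)

# Per-node RAM in MB. 7 GB for c1.xlarge is from the thread;
# 17.1 GB for m2.xlarge is an assumption (not stated in the thread).
C1_XLARGE_RAM_MB = 7 * 1024
M2_XLARGE_RAM_MB = int(17.1 * 1024)

for name, used, ram in [
    ("c1.xlarge, 6/1 slots, -Xmx400m", bradford_mb, C1_XLARGE_RAM_MB),
    ("m2.xlarge, 4/2 slots, -Xmx2000m", chris_mb, M2_XLARGE_RAM_MB),
]:
    print(f"{name}: {used} MB of heap, {used / ram:.0%} of node RAM")
```

If these numbers are in the right ballpark, the c1.xlarge setup leaves a good share of the node's RAM unused while each reducer, including the in-memory merge shown in the trace, gets only 400 MB of heap. That suggests trying a larger -Xmx before anything else, but it is a guess from the arithmetic, not a diagnosis.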