This seems almost equivalent to a heap size error -- since GCs are stop-the-world events, the fact that we were unable to release more than 2% of the heap suggests that almost all the memory is *currently in use* (i.e., live).
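If the heap really is almost entirely live, the direct remedy is a larger executor heap. A minimal sketch of how that might be configured (`spark.executor.memory` is a real Spark property; the `8g` value and app name are placeholders you would tune for your cluster, and property availability varies by Spark version and cluster manager):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Give each executor more heap, so a mostly-live working set
// still leaves the GC something to reclaim on each cycle.
val conf = new SparkConf()
  .setAppName("gc-overhead-example")         // placeholder name
  .set("spark.executor.memory", "8g")        // heap per executor (placeholder)
val sc = new SparkContext(conf)
```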
Decreasing the number of cores is another solution that decreases memory pressure, because each core requires its own set of buffers (for instance, each Kryo serializer has a buffer allocated to it) and has its own working set of data (some subset of a partition). Thus, decreasing the number of cores in use decreases memory contention.

On Tue, Jul 8, 2014 at 10:44 AM, Jerry Lam <chiling...@gmail.com> wrote:

> Hi Konstantin,
>
> I just ran into the same problem. I mitigated the issue by reducing the
> number of cores when I executed the job, which otherwise would not have
> been able to finish.
>
> Contrary to what many people believe, it does not necessarily mean that
> you were running out of memory. A better answer can be found here:
> http://stackoverflow.com/questions/4371505/gc-overhead-limit-exceeded and
> is copied here as a reference:
>
> "Excessive GC Time and OutOfMemoryError
>
> The concurrent collector will throw an OutOfMemoryError if too much time
> is being spent in garbage collection: if more than 98% of the total time is
> spent in garbage collection and less than 2% of the heap is recovered, an
> OutOfMemoryError will be thrown. This feature is designed to prevent
> applications from running for an extended period of time while making
> little or no progress because the heap is too small. If necessary, this
> feature can be disabled by adding the option -XX:-UseGCOverheadLimit to the
> command line.
>
> The policy is the same as that in the parallel collector, except that time
> spent performing concurrent collections is not counted toward the 98% time
> limit. In other words, only collections performed while the application is
> stopped count toward excessive GC time. Such collections are typically due
> to a concurrent mode failure or an explicit collection request (e.g., a
> call to System.gc())."
>
> It could be that there are many tasks running on the same node and they
> all compete for running GCs, which slows things down and triggers the
> error you saw.
> By reducing the number of cores, there are more CPU resources
> available to a task, so the GC can finish before the error gets thrown.
>
> HTH,
>
> Jerry
>
>
> On Tue, Jul 8, 2014 at 1:35 PM, Aaron Davidson <ilike...@gmail.com> wrote:
>
>> There is a difference between actual GC overhead, which can be reduced by
>> reusing objects, and this error, which actually means you ran out of
>> memory. This error can probably be relieved by increasing your executor
>> heap size, unless your data is corrupt and it is allocating huge arrays,
>> or you are otherwise keeping too much memory around.
>>
>> For your other question, you can reuse objects similarly to MapReduce
>> (HadoopRDD does this by actually using Hadoop's Writables, for instance),
>> but the general Spark APIs don't support this because mutable objects are
>> not friendly to caching or serializing.
>>
>>
>> On Tue, Jul 8, 2014 at 9:27 AM, Konstantin Kudryavtsev <
>> kudryavtsev.konstan...@gmail.com> wrote:
>>
>>> Hi all,
>>>
>>> I ran into the following exception during the map step:
>>>
>>> java.lang.OutOfMemoryError (java.lang.OutOfMemoryError: GC overhead limit exceeded)
>>> java.lang.reflect.Array.newInstance(Array.java:70)
>>> com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:325)
>>> com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:293)
>>> com.esotericsoftware.kryo.Kryo.readObjectOrNull(Kryo.java:699)
>>> com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:611)
>>> com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
>>> com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:648)
>>> com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:605)
>>> com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
>>> com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:648)
>>> com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:605)
>>> com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
>>> com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:648)
>>> com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:605)
>>> com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
>>> com.esotericsoftware.kryo.Kryo.readObjectOrNull(Kryo.java:699)
>>> com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:611)
>>> com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
>>> com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729)
>>> com.twitter.chill.Tuple2Serializer.read(TupleSerializers.scala:43)
>>> com.twitter.chill.Tuple2Serializer.read(TupleSerializers.scala:34)
>>> com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729)
>>> org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:115)
>>> org.apache.spark.serializer.DeserializationStream$$anon$1.getNext(Serializer.scala:125)
>>> org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:71)
>>> scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
>>> org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:30)
>>> org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
>>> org.apache.spark.rdd.CoGroupedRDD$$anonfun$compute$4.apply(CoGroupedRDD.scala:155)
>>> org.apache.spark.rdd.CoGroupedRDD$$anonfun$compute$4.apply(CoGroupedRDD.scala:154)
>>> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>>> scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>>>
>>> I'm using Spark 1.0. In map I create a new object each time; as I
>>> understand it, I can't reuse an object the way MapReduce development allows?
>>> I wondered if you could point me to how it is possible to avoid the GC
>>> overhead... thank you in advance.
>>>
>>> Thank you,
>>> Konstantin Kudryavtsev
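On Konstantin's object-reuse question, Aaron's point -- that mutable, reused objects are hostile to Spark's caching and serialization -- can be illustrated with plain Scala iterators, no Spark required. This is only a sketch; `Record` and `reusingMap` are hypothetical names invented for the illustration:

```scala
// A mutable record, analogous to a Hadoop Writable.
class Record(var value: Int)

// MapReduce-style reuse: one allocation, mutated once per element.
def reusingMap(input: Iterator[Int]): Iterator[Record] = {
  val shared = new Record(0) // the single shared instance
  input.map { x => shared.value = x; shared }
}

// Streaming through the iterator once is fine (each element is read
// before the next mutation)...
val sum = reusingMap(Iterator(1, 2, 3)).map(_.value).sum // 6

// ...but materializing it -- as Spark's cache or a shuffle buffer
// would -- retains N references to the SAME object, so every slot
// reflects only the last value written.
val cached = reusingMap(Iterator(1, 2, 3)).toArray
val values = cached.map(_.value) // Array(3, 3, 3), not Array(1, 2, 3)
```

This is why HadoopRDD can get away with reusing Writables (it controls the iterator's lifecycle), while the general Spark APIs, which may cache or serialize whatever your map emits, cannot safely do so.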