[ https://issues.apache.org/jira/browse/SPARK-1989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14050005#comment-14050005 ]
Guoqiang Li commented on SPARK-1989:
------------------------------------

In this case we should also trigger driver garbage collection. Related work: https://github.com/witgo/spark/compare/taskEvent

> Exit executors faster if they get into a cycle of heavy GC
> ----------------------------------------------------------
>
>                 Key: SPARK-1989
>                 URL: https://issues.apache.org/jira/browse/SPARK-1989
>             Project: Spark
>          Issue Type: New Feature
>          Components: Spark Core
>            Reporter: Matei Zaharia
>             Fix For: 1.1.0
>
> I've seen situations where an application is allocating too much memory across its tasks + cache to proceed, but Java gets into a cycle where it repeatedly runs full GCs, frees up a bit of the heap, and continues instead of giving up. This then leads to timeouts and confusing error messages. It would be better to crash with OOM sooner. The JVM has options to support this: http://java.dzone.com/articles/tracking-excessive-garbage.
> The right solution would probably be:
> - Add some config options used by spark-submit to set -XX:GCTimeLimit and -XX:GCHeapFreeLimit, with more conservative values than the defaults (e.g. 90% time limit, 5% free limit)
> - Make sure we pass these into the Java options for executors in each deployment mode

--
This message was sent by Atlassian JIRA
(v6.2#6252)
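For reference, a minimal sketch of what the proposed settings could look like if passed through spark-submit by hand today, using the existing spark.executor.extraJavaOptions and spark.driver.extraJavaOptions config keys. The 90% time limit and 5% free limit are the conservative values proposed above (the JVM defaults are 98 and 2), and the application class and jar names are placeholders:

    # assumes the parallel collector, where -XX:+UseGCOverheadLimit is enabled by default
    spark-submit \
      --conf "spark.executor.extraJavaOptions=-XX:+UseGCOverheadLimit -XX:GCTimeLimit=90 -XX:GCHeapFreeLimit=5" \
      --conf "spark.driver.extraJavaOptions=-XX:+UseGCOverheadLimit -XX:GCTimeLimit=90 -XX:GCHeapFreeLimit=5" \
      --class com.example.MyApp myapp.jar

With these limits the JVM throws java.lang.OutOfMemoryError: GC overhead limit exceeded once more than 90% of the time is spent in GC while less than 5% of the heap is recovered, so the executor fails with a clear OOM rather than limping along until a timeout.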