[ https://issues.apache.org/jira/browse/SPARK-1989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15262761#comment-15262761 ]
Thomas Graves commented on SPARK-1989:
--------------------------------------

Personally I don't agree with this, and I think we should close it as Won't Fix. There is some discussion on the PR above; here was my last response. If others disagree, please let me know.

====

I understand the argument that we want the best user experience, and I'm not against the settings themselves; I just think the benefit isn't worth the cost here. These are very specific, advanced Java options, and properly maintaining and parsing them is not, to me, a necessary thing. For instance, when Java 9, 10, or 11 comes out and the options no longer exist or change, we have to go change code; if IBM Java comes out with a different config, we have to change; if someone thinks 80% is better than 90%, we have to change. We already have enough PRs. Let the users/admins configure it for their version of Java and their specific needs.

We are adding a bunch of code to parse these options and set them to a default that someone thinks is better. Many others might disagree. For instance, with MapReduce we run it at 50% to fail fast. Why not set Spark to that? If we want it to fail fast, 50% is better than 90%, right? Why don't we set the garbage collector as well? To me this all comes down to configuring what is best for your specific application. Since Spark can do so many different things (streaming, ML, graph processing, ETL), having one default isn't necessarily best for all.

I think putting this in sets a bad precedent and just adds a maintenance headache for not much benefit. @vanzin mentions he has never seen anyone set this, so is it that big of a deal? Where is the data that says 90% is better than 98% for the majority of Spark users? Obviously, if things just don't run, as you mention with the max perm size, that makes it a much easier call and it makes sense to put it in, but I don't see that here. Many of my customers don't set it, and things are fine.
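As a sketch of the admin-side route argued for above (no Spark code change needed), the flags can be set through the existing `spark.executor.extraJavaOptions` / `spark.driver.extraJavaOptions` settings. The 90%/5% values below are only the examples floated in this issue, not recommendations:

```
# conf/spark-defaults.conf -- illustrative values, tune per application.
# With the HotSpot parallel collector, the JVM throws
# "java.lang.OutOfMemoryError: GC overhead limit exceeded" once more than
# GCTimeLimit percent of total time is spent in GC while less than
# GCHeapFreeLimit percent of the heap is reclaimed (JVM defaults: 98 and 2).
spark.executor.extraJavaOptions  -XX:GCTimeLimit=90 -XX:GCHeapFreeLimit=5
spark.driver.extraJavaOptions    -XX:GCTimeLimit=90 -XX:GCHeapFreeLimit=5
```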
I see other users set it because they explicitly want to fail very fast, and it's less than 90%. I also think setting -XX:GCHeapFreeLimit is riskier than setting -XX:GCTimeLimit; I have personally never seen anyone actually set it. It is defined as "the lower limit on the amount of space freed during a garbage collection, in percent of the maximum heap (default is 2)". That, to me, is much more application-specific than the GC time limit.

> Exit executors faster if they get into a cycle of heavy GC
> ----------------------------------------------------------
>
>          Key: SPARK-1989
>          URL: https://issues.apache.org/jira/browse/SPARK-1989
>      Project: Spark
>   Issue Type: New Feature
>   Components: Spark Core
>     Reporter: Matei Zaharia
>
> I've seen situations where an application is allocating too much memory across its tasks + cache to proceed, but Java gets into a cycle where it repeatedly runs full GCs, frees up a bit of the heap, and continues instead of giving up. This then leads to timeouts and confusing error messages. It would be better to crash with OOM sooner. The JVM has options to support this: http://java.dzone.com/articles/tracking-excessive-garbage.
> The right solution would probably be:
> - Add some config options used by spark-submit to set XX:GCTimeLimit and XX:GCHeapFreeLimit, with more conservative values than the defaults (e.g. 90% time limit, 5% free limit)
> - Make sure we pass these into the Java options for executors in each deployment mode

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
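For reference, a per-application version of what the quoted proposal describes is already possible through spark-submit's generic `--conf` mechanism, without new dedicated options. A hypothetical invocation (the class name and jar are placeholders, and the limit values are the issue's examples):

```
# Pass GC-overhead flags to this job's executors only; no Spark code
# change or new config option is required for this.
spark-submit \
  --class com.example.MyApp \
  --conf "spark.executor.extraJavaOptions=-XX:GCTimeLimit=90 -XX:GCHeapFreeLimit=5" \
  myapp.jar
```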