[jira] [Commented] (SPARK-19814) Spark History Server Out Of Memory / Extreme GC
[ https://issues.apache.org/jira/browse/SPARK-19814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15894952#comment-15894952 ]

Sean Owen commented on SPARK-19814:
-----------------------------------

Yes, that already describes further optimizations. I would close this as a duplicate, at least if you're not showing a memory leak.

> Spark History Server Out Of Memory / Extreme GC
> -----------------------------------------------
>
>                 Key: SPARK-19814
>                 URL: https://issues.apache.org/jira/browse/SPARK-19814
>             Project: Spark
>          Issue Type: Bug
>          Components: Web UI
>    Affects Versions: 1.6.1, 2.0.0, 2.1.0
>         Environment: Spark History Server (we've run it on several different Hadoop distributions)
>            Reporter: Simon King
>         Attachments: SparkHistoryCPUandRAM.png
>
> Spark History Server runs out of memory, gets into GC thrash and eventually becomes unresponsive. This seems to happen more quickly with heavy use of the REST API. We've seen this with several versions of Spark.
> Running with the following settings (Spark 2.1):
> spark.history.fs.cleaner.enabled true
> spark.history.fs.cleaner.interval 1d
> spark.history.fs.cleaner.maxAge 7d
> spark.history.retainedApplications 500
> We will eventually get errors like:
> 17/02/25 05:02:19 WARN ServletHandler:
> javax.servlet.ServletException: scala.MatchError: java.lang.OutOfMemoryError: GC overhead limit exceeded (of class java.lang.OutOfMemoryError)
>     at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:489)
>     at org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:427)
>     at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:388)
>     at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:341)
>     at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:228)
>     at org.spark_project.jetty.servlet.ServletHolder.handle(ServletHolder.java:812)
>     at org.spark_project.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:587)
>     at org.spark_project.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
>     at org.spark_project.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
>     at org.spark_project.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
>     at org.spark_project.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>     at org.spark_project.jetty.servlets.gzip.GzipHandler.handle(GzipHandler.java:529)
>     at org.spark_project.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
>     at org.spark_project.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
>     at org.spark_project.jetty.server.Server.handle(Server.java:499)
>     at org.spark_project.jetty.server.HttpChannel.handle(HttpChannel.java:311)
>     at org.spark_project.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
>     at org.spark_project.jetty.io.AbstractConnection$2.run(AbstractConnection.java:544)
>     at org.spark_project.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
>     at org.spark_project.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: scala.MatchError: java.lang.OutOfMemoryError: GC overhead limit exceeded (of class java.lang.OutOfMemoryError)
>     at org.apache.spark.deploy.history.ApplicationCache.getSparkUI(ApplicationCache.scala:148)
>     at org.apache.spark.deploy.history.HistoryServer.getSparkUI(HistoryServer.scala:110)
>     at org.apache.spark.status.api.v1.UIRoot$class.withSparkUI(ApiRootResource.scala:244)
>     at org.apache.spark.deploy.history.HistoryServer.withSparkUI(HistoryServer.scala:49)
>     at org.apache.spark.status.api.v1.ApiRootResource.getJobs(ApiRootResource.scala:66)
>     at sun.reflect.GeneratedMethodAccessor102.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.glassfish.jersey.server.internal.routing.SubResourceLocatorRouter$1.run(SubResourceLocatorRouter.java:158)
>     at org.glassfish.jersey.server.internal.routing.SubResourceLocatorRouter.getResource(SubResourceLocatorRouter.java:178)
>     at org.glassfish.jersey.server.internal.routing.SubResourceLocatorRouter.apply(SubResourceLocatorRouter.java:109)
>     at org.glassfish.jersey.server.internal.routing.RoutingStage._apply(RoutingStage.java:109)
>     at org.glassfish.jersey.server.internal.routing.RoutingStage._apply(RoutingStage.java:112)
>     at org.glassfish.jersey.server.internal.routing.RoutingStage._apply(RoutingStage.java:112)
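For reference, the cleaner/retention settings from the report would normally live in conf/spark-defaults.conf under the Spark install. A minimal sketch, written to /tmp purely so the snippet is self-contained (the real path is an assumption about your layout):

```shell
# Sketch: the reported History Server retention settings, laid out as
# they would appear in spark-defaults.conf. Copy into your actual
# conf directory; /tmp is used here only for illustration.
cat > /tmp/spark-defaults.conf <<'EOF'
spark.history.fs.cleaner.enabled    true
spark.history.fs.cleaner.interval   1d
spark.history.fs.cleaner.maxAge     7d
spark.history.retainedApplications  500
EOF
# Sanity check: four history-server properties were written.
grep -c '^spark\.history' /tmp/spark-defaults.conf   # prints 4
```

Note that spark.history.retainedApplications bounds how many loaded application UIs the server caches in memory, so it interacts directly with the heap pressure described in this issue.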
[jira] [Commented] (SPARK-19814) Spark History Server Out Of Memory / Extreme GC
[ https://issues.apache.org/jira/browse/SPARK-19814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15894948#comment-15894948 ]

Simon King commented on SPARK-19814:
------------------------------------

Sean, I think that giving more memory only delays the problem, but we will experiment more with larger heap settings. We're just starting to look into the issue, hoping for early help diagnosing or configuring around it. Hope there's a simpler fix than the major overhaul proposed here: https://issues.apache.org/jira/browse/SPARK-18085
[jira] [Commented] (SPARK-19814) Spark History Server Out Of Memory / Extreme GC
[ https://issues.apache.org/jira/browse/SPARK-19814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15894942#comment-15894942 ]

Sean Owen commented on SPARK-19814:
-----------------------------------

I'm not sure this is a bug. It depends on how much memory you give the history server and how much data it stores. 4G may not be enough; can you increase that? Unless there's a memory leak or some obviously oversized data structure, I don't think it's a bug, but if you have a concrete optimization, you can open a pull request.
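On the "increase that?" suggestion: the History Server's heap is usually raised via SPARK_DAEMON_MEMORY (set in conf/spark-env.sh or the environment before starting the daemon). A sketch, where the 8g value is an arbitrary example rather than a recommendation:

```shell
# Sketch: give the History Server a larger heap before starting it.
# SPARK_DAEMON_MEMORY is read by Spark's daemon start scripts
# (default 1g); 8g below is purely illustrative.
export SPARK_DAEMON_MEMORY=8g
echo "SPARK_DAEMON_MEMORY=${SPARK_DAEMON_MEMORY}"
# ./sbin/start-history-server.sh   # run from SPARK_HOME; commented out here
```

As noted in the comments above, if the working set genuinely exceeds any reasonable heap, this only delays the OOM rather than fixing it.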