[ 
https://issues.apache.org/jira/browse/SPARK-19814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon King closed SPARK-19814.
------------------------------
    Resolution: Duplicate

It looks wrong to characterize this as a bug: I couldn't identify an actual 
memory leak. It seems more likely that we'll have to wait for the major 
overhaul proposed in https://issues.apache.org/jira/browse/SPARK-18085 

> Spark History Server Out Of Memory / Extreme GC
> -----------------------------------------------
>
>                 Key: SPARK-19814
>                 URL: https://issues.apache.org/jira/browse/SPARK-19814
>             Project: Spark
>          Issue Type: Bug
>          Components: Web UI
>    Affects Versions: 1.6.1, 2.0.0, 2.1.0
>         Environment: Spark History Server (we've run it on several different 
> Hadoop distributions)
>            Reporter: Simon King
>         Attachments: SparkHistoryCPUandRAM.png
>
>
> Spark History Server runs out of memory, gets into GC thrash and eventually 
> becomes unresponsive. This seems to happen more quickly with heavy use of the 
> REST API. We've seen this with several versions of Spark. 
> We are running with the following settings (Spark 2.1):
> spark.history.fs.cleaner.enabled    true
> spark.history.fs.cleaner.interval   1d
> spark.history.fs.cleaner.maxAge     7d
> spark.history.retainedApplications  500
> We will eventually get errors like:
> 17/02/25 05:02:19 WARN ServletHandler:
> javax.servlet.ServletException: scala.MatchError: java.lang.OutOfMemoryError: 
> GC overhead limit exceeded (of class java.lang.OutOfMemoryError)
>   at 
> org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:489)
>   at org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:427)
>   at 
> org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:388)
>   at 
> org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:341)
>   at 
> org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:228)
>   at 
> org.spark_project.jetty.servlet.ServletHolder.handle(ServletHolder.java:812)
>   at 
> org.spark_project.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:587)
>   at 
> org.spark_project.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
>   at 
> org.spark_project.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
>   at 
> org.spark_project.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
>   at 
> org.spark_project.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>   at 
> org.spark_project.jetty.servlets.gzip.GzipHandler.handle(GzipHandler.java:529)
>   at 
> org.spark_project.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
>   at 
> org.spark_project.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
>   at org.spark_project.jetty.server.Server.handle(Server.java:499)
>   at org.spark_project.jetty.server.HttpChannel.handle(HttpChannel.java:311)
>   at 
> org.spark_project.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
>   at 
> org.spark_project.jetty.io.AbstractConnection$2.run(AbstractConnection.java:544)
>   at 
> org.spark_project.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
>   at 
> org.spark_project.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: scala.MatchError: java.lang.OutOfMemoryError: GC overhead limit 
> exceeded (of class java.lang.OutOfMemoryError)
>   at 
> org.apache.spark.deploy.history.ApplicationCache.getSparkUI(ApplicationCache.scala:148)
>   at 
> org.apache.spark.deploy.history.HistoryServer.getSparkUI(HistoryServer.scala:110)
>   at 
> org.apache.spark.status.api.v1.UIRoot$class.withSparkUI(ApiRootResource.scala:244)
>   at 
> org.apache.spark.deploy.history.HistoryServer.withSparkUI(HistoryServer.scala:49)
>   at 
> org.apache.spark.status.api.v1.ApiRootResource.getJobs(ApiRootResource.scala:66)
>   at sun.reflect.GeneratedMethodAccessor102.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.glassfish.jersey.server.internal.routing.SubResourceLocatorRouter$1.run(SubResourceLocatorRouter.java:158)
>   at 
> org.glassfish.jersey.server.internal.routing.SubResourceLocatorRouter.getResource(SubResourceLocatorRouter.java:178)
>   at 
> org.glassfish.jersey.server.internal.routing.SubResourceLocatorRouter.apply(SubResourceLocatorRouter.java:109)
>   at 
> org.glassfish.jersey.server.internal.routing.RoutingStage._apply(RoutingStage.java:109)
>   at 
> org.glassfish.jersey.server.internal.routing.RoutingStage._apply(RoutingStage.java:112)
>   at 
> org.glassfish.jersey.server.internal.routing.RoutingStage._apply(RoutingStage.java:112)
>   at 
> org.glassfish.jersey.server.internal.routing.RoutingStage._apply(RoutingStage.java:112)
>   at 
> org.glassfish.jersey.server.internal.routing.RoutingStage._apply(RoutingStage.java:112)
>   at 
> org.glassfish.jersey.server.internal.routing.RoutingStage.apply(RoutingStage.java:92)
>   at 
> org.glassfish.jersey.server.internal.routing.RoutingStage.apply(RoutingStage.java:61)
>   at org.glassfish.jersey.process.internal.Stages.process(Stages.java:197)
>   at org.glassfish.jersey.server.ServerRuntime$2.run(ServerRuntime.java:318)
>   at org.glassfish.jersey.internal.Errors$1.call(Errors.java:271)
>   at org.glassfish.jersey.internal.Errors$1.call(Errors.java:267)
>   at org.glassfish.jersey.internal.Errors.process(Errors.java:315)
>   at org.glassfish.jersey.internal.Errors.process(Errors.java:297)
>   at org.glassfish.jersey.internal.Errors.process(Errors.java:267)
>   at 
> org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:317)
>   at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:305)
>   at 
> org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:1154)
>   at 
> org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:473)
> In our case, memory usage gradually increases over perhaps 2 days and then 
> levels off near the max heap size (4G). Often within another 12-24 hours, 
> GC activity starts to increase, resulting in increasingly frequent errors 
> like the stack trace above.
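
The growth-then-plateau pattern described above is at least consistent with a bounded cache filling up, since spark.history.retainedApplications caps how many application UIs the History Server keeps cached. The following is only a rough sketch of that reasoning, not a diagnosis and not Spark's implementation: the names heap_budget_per_app and BoundedUICache are hypothetical, and the 10 MiB per cached UI is an invented figure chosen purely for illustration.

```python
from collections import OrderedDict

GIB = 1024 ** 3
MIB = 1024 ** 2


def heap_budget_per_app(max_heap_bytes: int, retained_apps: int) -> float:
    """Average heap share per cached application UI, in MiB."""
    return max_heap_bytes / retained_apps / MIB


class BoundedUICache:
    """Toy LRU cache standing in for the server's UI cache (hypothetical model)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.entries = OrderedDict()  # app_id -> retained size in bytes

    def load(self, app_id: str, size_bytes: int) -> None:
        if app_id in self.entries:
            # Cache hit: mark the entry as most recently used.
            self.entries.move_to_end(app_id)
            return
        self.entries[app_id] = size_bytes
        if len(self.entries) > self.capacity:
            # Evict the least recently used entry.
            self.entries.popitem(last=False)

    def retained_bytes(self) -> int:
        return sum(self.entries.values())


# Settings from the report: 4G heap, 500 retained applications.
budget_mib = heap_budget_per_app(4 * GIB, 500)

# Simulate REST requests touching 1000 distinct applications.
cache = BoundedUICache(capacity=500)
usage = []
for i in range(1000):
    cache.load(f"app-{i}", 10 * MIB)  # ASSUMPTION: 10 MiB per cached UI
    usage.append(cache.retained_bytes())

print(f"heap share per cached UI: {budget_mib:.3f} MiB")
print(f"retained after 500 loads:  {usage[499] // MIB} MiB")
print(f"retained after 1000 loads: {usage[999] // MIB} MiB")
```

In this model, retained memory climbs until the cache holds 500 entries and then stays flat. If the average cached UI is larger than its heap share (4 GiB / 500), the plateau sits above what the heap can hold, which would show up as GC pressure rather than a classic leak. Lowering spark.history.retainedApplications or raising the daemon heap are the obvious knobs to experiment with under that assumption.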


