Hi Jason,
The live UI initializes its ElementTrackingStore with an InMemoryStore, so all UI state is kept on the driver heap. On a long-running driver that creates a risk of OOM.

/**
 * Create an in-memory store for a live application.
 */
def createLiveStore(
    conf: SparkConf,
    appStatusSource: Option[AppStatusSource] = None): AppStatusStore = {
  val store = new ElementTrackingStore(new InMemoryStore(), conf)
  val listener = new AppStatusListener(store, conf, true, appStatusSource)
  new AppStatusStore(store, listener = Some(listener))
}

In addition to the parameters you mentioned, you can also try lowering the following:
* spark.ui.retainedTasks
* spark.ui.dagGraph.retainedRootRDDs
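
As a rough sketch, the retention settings from this thread could be collected in one place and passed to spark-submit like this. The object name UiRetention and all of the values are only illustrative assumptions, not recommendations; pick limits that fit your workload.

```scala
// Hypothetical helper: gathers the UI retention settings discussed in this
// thread and renders them as spark-submit arguments.
object UiRetention {
  // Values are illustrative assumptions; lower values keep less UI history.
  val settings: Seq[(String, String)] = Seq(
    "spark.ui.retainedJobs"              -> "100",
    "spark.ui.retainedStages"            -> "100",
    "spark.ui.retainedTasks"             -> "10000",
    "spark.ui.retainedDeadExecutors"     -> "10",
    "spark.ui.dagGraph.retainedRootRDDs" -> "100"
  )

  // Produces: --conf key=value --conf key=value ...
  def toSubmitArgs: Seq[String] =
    settings.flatMap { case (key, value) => Seq("--conf", s"$key=$value") }

  def main(args: Array[String]): Unit =
    println(toSubmitArgs.mkString(" "))
}
```

The same key/value pairs can also be applied in application code by calling set(key, value) on the SparkConf before the SparkContext is created.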

If you can share more details about the situation, that would help.

Best
Qian


> On Aug 3, 2022, at 11:04 AM, Jason Jun <jaes...@gmail.com> wrote:
> 
> Hi there,
> 
> We have a Spark driver running 24x7, and it hits an OOM roughly every 
> 10 days.
> After analyzing a heap dump, I found that 
> org.apache.spark.status.ElementTrackingStore holds 85% of the heap, as 
> shown in this image:
> <image.png>
> 
> I found a JIRA ticket suggesting that the following parameters could be 
> the root cause:
> https://issues.apache.org/jira/browse/SPARK-26395
> spark.ui.retainedDeadExecutors
> spark.ui.retainedJobs
> spark.ui.retainedStages
> 
> But lowering them didn't fix it. These changes only delayed the OOM from 1 week to 10 days.
> 
> I would really appreciate it if anyone could suggest a solution.
> 
> Thanks
> Jason
> 
