[ https://issues.apache.org/jira/browse/SPARK-43523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Amine Bagdouri updated SPARK-43523:
-----------------------------------
Description:

We have a distributed Spark application running on Azure HDInsight using Spark version 2.4.4.

After a few days of active processing in our application, we noticed that the GC CPU time ratio of the driver was close to 100%. We suspected a memory leak, so we produced a heap dump and analyzed it with Eclipse Memory Analyzer.

Here is some interesting data from the driver's heap dump (heap size is 8 GB):
* The estimated retained heap size of String objects (~5M instances) is 3.3 GB. Most of these instances appear to correspond to Spark events.
* The estimated retained size of Spark UI's AppStatusListener instance is 1.1 GB.
* There are 18K LiveJob objects with status "RUNNING", even though there should never be more than 16 running jobs, since we use a fixed-size thread pool of 16 threads to run Spark queries.
* There are 485K LiveTask objects.
* The AsyncEventQueue instance associated with the AppStatusListener reports a dropped events count of 854 and a total events count of 10001, knowing that the dropped events counter is reset every minute and that the queue's default capacity is 10000.

We think there is a memory leak in Spark UI. Here is our analysis of the root cause of this leak:
* AppStatusListener is notified of Spark events through a bounded queue in AsyncEventQueue.
* AppStatusListener updates its state (kvstore, liveTasks, liveStages, liveJobs, ...) based on the received events. For example, onTaskStart adds a task to the liveTasks map and onTaskEnd removes it.
* When the event rate is very high, the bounded queue in AsyncEventQueue fills up, and some events are dropped and never reach AppStatusListener.
* A dropped event that signals the end of a processing unit prevents the corresponding AppStatusListener state from being cleaned up. For example, a dropped onTaskEnd event prevents the task from being removed from the liveTasks map, and the task remains on the heap until the driver's JVM is stopped (see the sketch after this description).

We were able to confirm this analysis by reducing the capacity of the AsyncEventQueue (spark.scheduler.listenerbus.eventqueue.capacity=10). After launching many Spark queries with this configuration, we observed that the number of active jobs in Spark UI increased rapidly and remained high even though all submitted queries had completed. We also noticed that some executor task counters in Spark UI were negative, which confirms that the AppStatusListener state does not accurately reflect reality and that it can be a victim of event drops.

Suggested fix:
There are already limits on the number of "dead" objects in AppStatusListener's maps (for example, spark.ui.retainedJobs). We suggest enforcing an additional configurable limit on the total number of objects in AppStatusListener's maps and kvstore. This would bound the leak under a high event rate, although AppStatusListener statistics would remain inaccurate.
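To make the failure mode concrete, here is a minimal, standalone sketch of the mechanism described above. It does not use Spark's actual AsyncEventQueue or AppStatusListener classes; the names ToyStatusListener and LeakSketch are made up for illustration. A bounded queue silently drops events when it is full, and because the listener's map only shrinks on "end" events, every dropped end event strands an entry in the map for the lifetime of the JVM:

{code:scala}
import java.util.concurrent.ArrayBlockingQueue
import scala.collection.concurrent.TrieMap

sealed trait Event
case class TaskStart(taskId: Long) extends Event
case class TaskEnd(taskId: Long) extends Event

// Stand-in for AppStatusListener: its map grows on start events and shrinks on end events.
class ToyStatusListener {
  val liveTasks = new TrieMap[Long, String]()
  def onEvent(e: Event): Unit = e match {
    case TaskStart(id) => liveTasks.put(id, s"task-$id")
    case TaskEnd(id)   => liveTasks.remove(id)
  }
}

object LeakSketch extends App {
  val queue    = new ArrayBlockingQueue[Event](10)   // bounded, like AsyncEventQueue
  val listener = new ToyStatusListener
  var dropped  = 0

  for (id <- 1L to 1000L) {
    // offer() returns false when the queue is full: the event is dropped.
    if (!queue.offer(TaskStart(id))) dropped += 1
    if (!queue.offer(TaskEnd(id)))   dropped += 1    // a dropped TaskEnd leaks the task
    // Drain only one event per iteration to simulate a listener thread that cannot keep up.
    Option(queue.poll()).foreach(listener.onEvent)
  }
  // Drain whatever is left once the producer stops.
  Iterator.continually(queue.poll()).takeWhile(_ != null).foreach(listener.onEvent)

  println(s"dropped events: $dropped, tasks stranded in liveTasks: ${listener.liveTasks.size}")
}
{code}

For reference, this is how the reduced queue capacity used to reproduce the issue quickly can be set. The config key is the one mentioned above; the local master and app name are only placeholders for a standalone test:

{code:scala}
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .master("local[*]")                                              // placeholder for a local repro
  .appName("event-drop-repro")                                     // placeholder name
  .config("spark.scheduler.listenerbus.eventqueue.capacity", "10") // force event drops quickly
  .getOrCreate()
{code}

And a rough sketch of the kind of cap suggested in the fix: a configurable hard limit on the total number of live entries, evicting the oldest entry once the limit is exceeded. This is only an illustration of the idea (CappedMap is not a Spark class), not a proposed implementation:

{code:scala}
import scala.collection.mutable

class CappedMap[K, V](maxEntries: Int) {
  private val underlying = mutable.LinkedHashMap.empty[K, V]   // insertion-ordered

  def put(k: K, v: V): Unit = {
    underlying.put(k, v)
    while (underlying.size > maxEntries) {
      underlying.remove(underlying.head._1)                    // evict the oldest entry
    }
  }
  def remove(k: K): Option[V] = underlying.remove(k)
  def size: Int = underlying.size
}
{code}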
> Memory leak in Spark UI
> -----------------------
>
>                 Key: SPARK-43523
>                 URL: https://issues.apache.org/jira/browse/SPARK-43523
>             Project: Spark
>          Issue Type: Bug
>          Components: Web UI
>    Affects Versions: 2.4.4
>            Reporter: Amine Bagdouri
>            Priority: Major