[ https://issues.apache.org/jira/browse/SPARK-26513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen resolved SPARK-26513.
-------------------------------
    Resolution: Won't Fix

> Trigger GC on executor node idle
> --------------------------------
>
>                 Key: SPARK-26513
>                 URL: https://issues.apache.org/jira/browse/SPARK-26513
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 2.3.0
>            Reporter: Sandish Kumar HN
>            Priority: Major
>
> Correct me if I'm wrong.
>
> *Stage:*
> On a large cluster, each stage runs on a number of executors. A few
> executors may finish their tasks early and then wait for the remaining
> tasks of the stage, which are running on other executor nodes in the
> cluster. A stage completes only when all of its tasks have finished,
> and the next stage cannot start until then.
>
> Why don't we trigger GC while an executor is idle, waiting for the
> remaining tasks to finish? The executor has to wait anyway, usually
> for at least a couple of seconds, whereas a GC should take less than
> 300 ms at most.
>
> I have proposed a small code snippet that triggers GC when the executor
> has no running tasks and its heap usage is above a given threshold.
> This could improve performance for long-running Spark jobs.
> We referred to this paper
> [https://www.computer.org/csdl/proceedings/hipc/2016/5411/00/07839705.pdf]
> and found performance improvements in our long-running Spark batch jobs.
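
For readers following the thread, the idea in the description (request a GC when no tasks are running and heap usage is above a threshold) can be sketched roughly as below. This is not the snippet the reporter proposed and not part of Spark: the class IdleGcTrigger, its methods, and the threshold parameter are hypothetical names chosen for illustration. Only the standard java.lang.management API and System.gc() are used, and System.gc() is merely a request that the JVM may ignore (for example when -XX:+DisableExplicitGC is set).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not a Spark class: tracks running tasks and requests
// a GC when the executor becomes idle and the heap is fuller than a threshold.
final class IdleGcTrigger {
    private final AtomicInteger runningTasks = new AtomicInteger(0);
    private final double heapThreshold; // fraction of max heap, 0.0 to 1.0

    public IdleGcTrigger(double heapThreshold) {
        if (heapThreshold < 0.0 || heapThreshold > 1.0) {
            throw new IllegalArgumentException("heapThreshold must be in [0, 1]");
        }
        this.heapThreshold = heapThreshold;
    }

    // Call when a task starts on this executor.
    public void taskStarted() {
        runningTasks.incrementAndGet();
    }

    // Call when a task ends; returns true if a GC was requested.
    public boolean taskFinished() {
        int remaining = runningTasks.decrementAndGet();
        return remaining == 0 && maybeGc();
    }

    // Pure decision rule: idle executor and heap usage above the threshold.
    public static boolean shouldGc(int running, long usedBytes, long maxBytes,
                                   double threshold) {
        if (running > 0 || maxBytes <= 0) {
            return false;
        }
        return (double) usedBytes / maxBytes > threshold;
    }

    // Reads current heap usage and requests a GC if the rule is satisfied.
    public boolean maybeGc() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        // getMax() returns -1 when the maximum is undefined; fall back to committed.
        long max = heap.getMax() > 0 ? heap.getMax() : heap.getCommitted();
        if (!shouldGc(runningTasks.get(), heap.getUsed(), max, heapThreshold)) {
            return false;
        }
        System.gc(); // a hint only; the JVM decides whether and how to collect
        return true;
    }
}
```

In this sketch the decision rule is kept in a static method so it can be checked without touching the real heap. A caller would invoke taskStarted() and taskFinished() around each task; where such hooks would live inside an executor is left open here.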