GitHub user sitalkedia commented on the issue: https://github.com/apache/spark/pull/15722

@davies - We fixed a similar issue with `UnsafeExternalSorter` in SPARK-14363. Basically, the following scenario leads to OOM: let's say we have 4G of memory in total, shared across 4 tasks, so the fair share of each task is around 1G. At some point a few tasks finish, and before the scheduler can schedule more tasks on the executor, the currently running tasks grab all the memory from the memory manager. In that situation the `LongArray` grows beyond those tasks' fair share of memory. Later, when the scheduler schedules more tasks on this executor, the already running tasks are forced to spill, but since they do not reset the `LongArray`, the memory it holds is not released and this results in OOM.
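
To make the numbers concrete, here is a rough toy model of that scenario. It is only a sketch and not Spark's memory manager or sorter API: the names `free_after_spill`, `INITIAL_ARRAY_MB`, and the 1.5G/0.5G split between the pointer array and the data pages are assumptions picked for illustration.

```python
# Toy model of the scenario above; sizes in MB. This is NOT Spark's API.
TOTAL_MB = 4 * 1024                  # 4G shared by the executor
SLOTS = 4                            # concurrent tasks
FAIR_SHARE_MB = TOTAL_MB // SLOTS    # about 1G per task
INITIAL_ARRAY_MB = 1                 # hypothetical size of a freshly reset array


def free_after_spill(reset_array_on_spill):
    """Return the MB left for newly scheduled tasks after a forced spill."""
    # Two tasks kept running while the other two slots were idle, so each
    # grew to 2G: an assumed 1.5G pointer array plus 0.5G of data pages.
    running = [{"array": 1536, "pages": 512} for _ in range(2)]

    for task in running:
        task["pages"] = 0                      # spilling frees the data pages
        if reset_array_on_spill:
            task["array"] = INITIAL_ARRAY_MB   # shrink the array as well

    held = sum(task["array"] + task["pages"] for task in running)
    return TOTAL_MB - held


needed_by_new_tasks = 2 * FAIR_SHARE_MB        # two new tasks, 1G each
print(free_after_spill(False), needed_by_new_tasks)
print(free_after_spill(True), needed_by_new_tasks)
```

Under these assumed sizes, spilling without a reset leaves less free memory than the two new tasks' combined fair share, while spilling with a reset leaves almost the whole pool available.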