GitHub user sitalkedia opened a pull request: https://github.com/apache/spark/pull/12285
[SPARK-14363] Fix executor OOM due to memory leak in the Sorter

## What changes were proposed in this pull request?

Fix a memory leak in the Sorter. When the UnsafeExternalSorter spills data to disk, it does not free the underlying pointer array. As a result, we see many executor OOMs as well as memory underutilization. This is a regression partially introduced in PR https://github.com/apache/spark/pull/9241

## How was this patch tested?

Tested by running a job; observed around a 30% speedup after this change.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sitalkedia/spark executor_oom

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/12285.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #12285

----
commit 69ca097aba076013a0b4e1f25f8da7b5568c4d55
Author: Sital Kedia <ske...@fb.com>
Date:   2016-04-10T08:05:53Z

    [SPARK-14363] Fix executor OOM due to memory leak in the Sorter

----
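For readers unfamiliar with the failure mode, the leak described in this pull request can be illustrated with a toy model. This is a sketch under stated assumptions, not Spark's code: `ToyMemoryManager`, `ToySorter`, `free_array_on_spill`, and the byte counts are all hypothetical names and numbers chosen for illustration. It only shows why a pointer array that is still held after a spill can leave too little memory for the next allocation in the same task.

```python
class ToyMemoryManager:
    """Hypothetical stand-in for a per-task memory budget."""

    def __init__(self, budget):
        self.budget = budget
        self.used = 0

    def acquire(self, n):
        # Refuse the request if it would exceed the budget.
        if self.used + n > self.budget:
            raise MemoryError(f"requested {n}, only {self.budget - self.used} free")
        self.used += n

    def release(self, n):
        self.used -= n


class ToySorter:
    """Hypothetical sorter holding a pointer array plus record pages."""

    def __init__(self, mm, array_bytes, free_array_on_spill):
        self.mm = mm
        self.array_bytes = array_bytes
        self.free_array_on_spill = free_array_on_spill
        self.page_bytes = 0
        self.mm.acquire(array_bytes)  # pointer array is allocated up front
        self.holds_array = True

    def insert(self, n):
        # Re-acquire the pointer array if a previous spill released it.
        if not self.holds_array:
            self.mm.acquire(self.array_bytes)
            self.holds_array = True
        self.mm.acquire(n)
        self.page_bytes += n

    def spill(self):
        # Writing to disk is not modeled; only the memory accounting is.
        self.mm.release(self.page_bytes)
        self.page_bytes = 0
        if self.free_array_on_spill:
            self.mm.release(self.array_bytes)
            self.holds_array = False


def run(free_array_on_spill):
    mm = ToyMemoryManager(budget=100)
    sorter = ToySorter(mm, array_bytes=40, free_array_on_spill=free_array_on_spill)
    sorter.insert(30)
    sorter.spill()
    try:
        mm.acquire(80)  # another consumer in the same task asks for memory
        return "ok", mm.used
    except MemoryError:
        return "oom", mm.used


print(run(free_array_on_spill=False))  # array still held after the spill
print(run(free_array_on_spill=True))   # array released together with the pages
```

In this model, the run that keeps the array after spilling fails the later 80-byte request even though the sorter holds no records, while the run that releases the array succeeds. The real sorter's allocation and reset logic is more involved than this accounting sketch.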