GitHub user squito opened a pull request: https://github.com/apache/spark/pull/22511
[SPARK-25422][CORE] Don't memory map blocks streamed to disk.

After data has been streamed to disk, the buffers are inserted into the memory store in some cases (e.g., with broadcast blocks). But the broadcast code also disposes of those buffers when the data has been read, to ensure that we don't leave mapped buffers using up memory. That disposal then leaves garbage data in the memory store.

## How was this patch tested?

Ran the old failing test in a loop. Full tests on Jenkins.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/squito/spark SPARK-25422

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22511.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #22511

----

commit aee82abe4cd9fbefa14fb280644276fe491bcf9a
Author: Imran Rashid <irashid@...>
Date:   2018-09-20T19:50:06Z

    [SPARK-25422][CORE] Don't memory map blocks streamed to disk.

    After data has been streamed to disk, the buffers are inserted into the
    memory store in some cases (e.g., with broadcast blocks). But the broadcast
    code also disposes of those buffers when the data has been read, to ensure
    that we don't leave mapped buffers using up memory, which then leads to
    garbage data in the memory store.

----

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
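The core of the fix the PR title describes is to read a block's file into an on-heap buffer instead of memory-mapping it: a mapped `ByteBuffer` remains tied to the underlying file and becomes unsafe once some other code disposes of it, whereas a heap copy is independent of the file's fate. A minimal Java sketch of the read-into-heap idea (illustrative only; the class and method names are made up here and this is not Spark's actual `DiskStore` code):

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ReadVsMap {
    // Read the whole file into an on-heap ByteBuffer instead of mmap-ing it
    // with FileChannel.map. The returned buffer is a plain copy: it stays
    // valid even after the underlying file is deleted or other code disposes
    // of its buffers.
    static ByteBuffer readFully(Path path) throws IOException {
        try (FileChannel ch = FileChannel.open(path, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate((int) ch.size());
            while (buf.hasRemaining() && ch.read(buf) >= 0) {
                // keep reading until the buffer is full or EOF
            }
            buf.flip();
            return buf;
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("block", ".bin");
        Files.write(tmp, "streamed block data".getBytes(StandardCharsets.UTF_8));

        ByteBuffer onHeap = readFully(tmp);
        Files.delete(tmp); // the heap copy is unaffected by the file's fate

        byte[] bytes = new byte[onHeap.remaining()];
        onHeap.get(bytes);
        System.out.println(new String(bytes, StandardCharsets.UTF_8));
    }
}
```

With a mapped buffer, the same `Files.delete` plus a buffer disposal would leave later reads of the memory-store entry pointing at garbage, which is the failure mode the PR description mentions.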