GitHub user rshest opened a pull request: https://github.com/apache/spark/pull/19154
Fix DiskBlockManager crashing when a root local folder has been externally deleted

## What changes were proposed in this pull request?

**The problem:**

`DiskBlockManager` has a notion of "scratch" local folder(s), which can be configured via the `spark.local.dir` option and which default to the system's `/tmp`. The hierarchy is two-level, e.g. `/blockmgr-XXX.../YY`, where the `YY` part is derived from a hash, to spread files evenly across subdirectories.

The function `DiskBlockManager.getFile` _expects_ the top-level directories (`blockmgr-XXX...`) to always exist (they are created once, when the Spark context is first created). If one of them is missing, it fails with a message like:

```
...
java.io.IOException: Failed to create local dir in /tmp/blockmgr-XXX.../YY
```

However, the top-level directories may not always exist. In particular, if the default `/tmp` folder is used, certain operating systems clean it on a regular basis (e.g. once per day via a system cron job).

The symptom is that after a process using Spark has been running for a while (a few days), it may no longer be able to load files, because the scratch directories are gone and `DiskBlockManager.getFile` crashes.

The change/mitigation is simple: use `File.mkdirs` instead of `File.mkdir` inside `getFile`, so that the _full path_ is created there. This handles the case where the parent directory no longer exists.

## How was this patch tested?

I have added a unit test to `DiskBlockManagerSuite` that fails without this patch and passes with it.
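The difference between `File.mkdir` and `File.mkdirs` that the fix relies on can be reproduced outside Spark. The following is a minimal sketch, not Spark's code: the class name `MkdirVsMkdirs`, the helper `subDirUnderDeletedRoot`, the `blockmgr-demo` prefix and the `0a` subdirectory name are all hypothetical, chosen only to mimic the two-level layout described in the problem statement.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class MkdirVsMkdirs {

    /**
     * Creates a temporary "root" directory, then deletes it to simulate an
     * external cleanup (e.g. a cron job wiping /tmp). Returns the path of a
     * second-level subdirectory whose parent no longer exists.
     * All names here are illustrative assumptions, not Spark internals.
     */
    public static File subDirUnderDeletedRoot() throws IOException {
        File root = Files.createTempDirectory("blockmgr-demo").toFile();
        if (!root.delete()) {
            throw new IOException("Could not delete " + root);
        }
        return new File(root, "0a");
    }

    public static void main(String[] args) throws IOException {
        // mkdir() creates only the last path component, so it returns
        // false when the parent directory is missing.
        File viaMkdir = subDirUnderDeletedRoot();
        System.out.println("mkdir  succeeded: " + viaMkdir.mkdir());

        // mkdirs() also creates any missing parent directories.
        File viaMkdirs = subDirUnderDeletedRoot();
        System.out.println("mkdirs succeeded: " + viaMkdirs.mkdirs());

        // Clean up the directories created by mkdirs().
        viaMkdirs.delete();
        viaMkdirs.getParentFile().delete();
    }
}
```

In this sketch `mkdir` reports failure because the parent is gone, which corresponds to the situation in which `getFile` throws `Failed to create local dir`, while `mkdirs` recreates the whole path.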
You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rshest/spark fix-DiskBlockManager-local-root-removed

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19154.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19154

----
commit dc502493c8c5cde03ba4dc1ce8391e176c583267
Author: Ruslan Shestopalyuk <rushb...@gmail.com>
Date:   2017-09-06T15:24:43Z

    Fix DiskBlockManager crashing when root local folder has been removed

----