GitHub user rshest opened a pull request:
https://github.com/apache/spark/pull/19154
Fix DiskBlockManager crashing when a root local folder has been externally
deleted
## What changes were proposed in this pull request?
**The problem:**
`DiskBlockManager` has a notion of "scratch" local folder(s), which can be
configured via the `spark.local.dir` option and which default to the system's
`/tmp`. The hierarchy is two-level, e.g. `/blockmgr-XXX.../YY`, where the `YY`
part is derived from a hash, to spread files evenly.
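To illustrate the two-level layout, here is a minimal sketch of how a file name could be mapped to a root folder and a `YY` subfolder. It is not a copy of Spark's code: the class `ScratchLayoutSketch`, its method names, the hashing via `String.hashCode`, and the count of 64 subfolders are all assumptions made for the example.

```java
public class ScratchLayoutSketch {
    // Hypothetical helper: non-negative hash of a file name.
    static int nonNegativeHash(String fileName) {
        return fileName.hashCode() & Integer.MAX_VALUE;
    }

    // Picks which configured root folder (blockmgr-XXX...) a file goes to.
    static int rootIndexFor(String fileName, int numRoots) {
        return nonNegativeHash(fileName) % numRoots;
    }

    // Derives the two-character "YY" subfolder name from the same hash.
    // Assumes subDirsPerRoot <= 256 so the name fits in two hex digits.
    static String subDirFor(String fileName, int numRoots, int subDirsPerRoot) {
        int subDirId = (nonNegativeHash(fileName) / numRoots) % subDirsPerRoot;
        return String.format("%02x", subDirId);
    }

    public static void main(String[] args) {
        String fileName = "rdd_0_0"; // illustrative block file name
        int numRoots = 2;            // assumed number of spark.local.dir entries
        int subDirsPerRoot = 64;     // assumed subfolder count per root
        System.out.println("root index: " + rootIndexFor(fileName, numRoots));
        System.out.println("subfolder:  " + subDirFor(fileName, numRoots, subDirsPerRoot));
    }
}
```

The same file name always maps to the same root and subfolder, so lookups do not need to scan the directory tree.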
The function `DiskBlockManager.getFile` _expects_ the top-level directories
(`blockmgr-XXX...`) to always exist (they are created once, when the Spark
context is first created); otherwise it fails with a message like:
```
... java.io.IOException: Failed to create local dir in
/tmp/blockmgr-XXX.../YY
```
However, the directories may not always exist, in particular when the default
`/tmp` folder is used: on certain operating systems it is cleaned on a
regular basis (e.g. once per day via a system cron job).
The symptom is that after a process using Spark has been running for a while
(a few days), it may no longer be able to load files, because the scratch
directories are gone and `DiskBlockManager.getFile` crashes.
The change/mitigation is simple: use `File.mkdirs` instead of `File.mkdir`
inside `getFile`, so that the _full path_ is created there, which handles
the case where the parent directory no longer exists.
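The difference between the two calls can be reproduced outside Spark. The sketch below is not the patched `getFile`; the class `MkdirsSketch`, its helper methods, the temporary directory prefix, and the subfolder name `0a` are hypothetical and only simulate a root folder being deleted externally.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class MkdirsSketch {
    // Creates only the last path component; fails if the parent is missing.
    static boolean createWithMkdir(File root, String subDir) {
        return new File(root, subDir).mkdir();
    }

    // Creates the last path component plus any missing parents.
    static boolean createWithMkdirs(File root, String subDir) {
        return new File(root, subDir).mkdirs();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a blockmgr-XXX... root folder.
        File root = Files.createTempDirectory("blockmgr-sketch").toFile();

        // Simulate an external cleanup (e.g. a cron job wiping /tmp).
        if (!root.delete()) {
            throw new IOException("could not remove " + root);
        }

        // The parent is gone, so mkdir cannot create the subfolder.
        System.out.println("mkdir succeeded:  " + createWithMkdir(root, "0a"));

        // mkdirs recreates the missing parent along with the subfolder.
        System.out.println("mkdirs succeeded: " + createWithMkdirs(root, "0a"));
    }
}
```

Note that `mkdirs` returns `false` if the directory already exists, so a caller would typically check for existence before treating `false` as a failure.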
## How was this patch tested?
I have added a unit test to `DiskBlockManagerSuite` that fails without this
patch and passes with it.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/rshest/spark fix-DiskBlockManager-local-root-removed
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/19154.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #19154
commit dc502493c8c5cde03ba4dc1ce8391e176c583267
Author: Ruslan Shestopalyuk <rushb...@gmail.com>
Date: 2017-09-06T15:24:43Z
Fix DiskBlockManager crashing when root local folder has been removed