Devaraj K created MAPREDUCE-4352:
------------------------------------
Summary: Jobs fail during resource localization when the number of
directories in the file cache reaches the Unix directory limit
Key: MAPREDUCE-4352
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4352
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: nodemanager
Affects Versions: 2.0.1-alpha, 3.0.0
Reporter: Devaraj K
Assignee: Devaraj K
If multiple jobs use the distributed cache with small files, the file system's
limit on the number of subdirectories in a directory is reached before the
cache size limit is, and no further directories can be created in the file
cache. The jobs then start failing with the exception below.
{code:xml}
java.io.IOException: mkdir of /tmp/nm-local-dir/usercache/root/filecache/1701886847734194975 failed
    at org.apache.hadoop.fs.FileSystem.primitiveMkdir(FileSystem.java:909)
    at org.apache.hadoop.fs.DelegateToFileSystem.mkdir(DelegateToFileSystem.java:143)
    at org.apache.hadoop.fs.FilterFs.mkdir(FilterFs.java:189)
    at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:706)
    at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:703)
    at org.apache.hadoop.fs.FileContext$FSLinkResolver.resolve(FileContext.java:2325)
    at org.apache.hadoop.fs.FileContext.mkdir(FileContext.java:703)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:147)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:49)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
{code}
We should have a mechanism to clean up cached files once the number of
directories in the file cache crosses a specified limit, similar to the
existing limit on cache size.
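
A rough sketch of what such a mechanism could look like, not the actual
NodeManager code: the class DirCountBoundedCache and its methods are
hypothetical names made up for illustration. It tracks cache directories in
least-recently-used order and picks eviction candidates whenever either the
byte limit or the directory-count limit is exceeded. It assumes every tracked
directory is safe to delete, so a real implementation would also have to skip
resources that are still in use by running containers.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical sketch: bookkeeping for a file cache that is bounded by
 * both total size and number of directories. Not part of Hadoop.
 */
final class DirCountBoundedCache {
    private final long maxBytes;
    private final int maxDirs;
    private long totalBytes = 0L;

    // Access-ordered map: iteration starts at the least recently used entry.
    private final LinkedHashMap<String, Long> entries =
        new LinkedHashMap<>(16, 0.75f, true);

    DirCountBoundedCache(long maxBytes, int maxDirs) {
        this.maxBytes = maxBytes;
        this.maxDirs = maxDirs;
    }

    /** Records a newly localized directory, or refreshes one already tracked. */
    void put(String dir, long sizeBytes) {
        Long old = entries.put(dir, sizeBytes);
        totalBytes += sizeBytes - (old == null ? 0L : old);
    }

    /**
     * Drops least recently used entries until both limits hold again.
     * Returns the directories that the caller should delete from disk.
     */
    List<String> evict() {
        List<String> evicted = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = entries.entrySet().iterator();
        while ((totalBytes > maxBytes || entries.size() > maxDirs) && it.hasNext()) {
            Map.Entry<String, Long> oldest = it.next();
            totalBytes -= oldest.getValue();
            evicted.add(oldest.getKey());
            it.remove();
        }
        return evicted;
    }

    int size() {
        return entries.size();
    }
}
```

With many small files the byte limit is never reached, so in this sketch it is
the maxDirs check that triggers cleanup. Setting maxDirs comfortably below the
file system's per-directory limit would leave room for directories created
between cleanup runs.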