We use the DistributedCache class to distribute a few lookup files for our jobs. We have been aggressively deleting leftover data from failed task attempts, and our cleanup script accidentally deleted the path containing our distributed cache files.
Our task attempt leftover data was here (per node): /hadoop/hadoop-metadata/cache/mapred/local/ and our distributed cache path was: /hadoop/hadoop-metadata/cache/mapred/local/taskTracker/archive/<nameNode> which we deleted by accident. Does this latter path look normal? I'm not that familiar with DistributedCache, but I'm up right now investigating the issue, so I thought I'd ask.

After the deletion, the first two jobs to run (which use the addCacheFile method to distribute their files) didn't seem to push the files back out to the cache path, except on one node. Is this expected behavior? Shouldn't addCacheFile check whether the files are missing and, if so, repopulate them as needed?

Ultimately I'm trying to get a handle on whether it's safe to delete the distributed cache path while the grid is quiet and no jobs are running. That is, whether addCacheFile is designed to be robust against its cached files not being present at each job start.
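For reference, our jobs register their lookup files roughly like the sketch below (the class name, HDFS path, and job name here are illustrative placeholders, not our real ones). This is the standard DistributedCache.addCacheFile(URI, Configuration) call; the framework is supposed to localize the file into each TaskTracker's .../taskTracker/archive/ area before tasks run, which is the behavior I'm asking about:

```java
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.mapreduce.Job;

public class LookupJob {  // hypothetical driver, for illustration only
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Register an HDFS file with the distributed cache; the
        // framework copies it to each node's local cache directory
        // (under taskTracker/archive/) before the tasks start.
        DistributedCache.addCacheFile(
                new URI("/user/ourteam/lookup/countries.txt"), conf);
        Job job = new Job(conf, "lookup-join");
        // ... set mapper/reducer, input/output paths, then submit ...
    }
}
```

(Requires the Hadoop jars on the classpath and a running cluster, so it is a sketch rather than something runnable standalone.)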