FS caching is enabled on the cluster (i.e. the default is not changed). Our code isn't actually mapper code, but a standalone Java program being run as part of Oozie. Leaving resources unclosed just seemed confusing and not a very clean strategy. Hence my suggestion to get an uncached FS handle for this use case alone. Note that I am not suggesting disabling FS caching in general.
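For illustration, here is a minimal sketch of what I mean by an uncached handle, using the fs.hdfs.impl.disable.cache property mentioned earlier in the thread. The NameNode URI and path below are placeholders, not from our actual setup:

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class UncachedFsExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Request a private (uncached) FileSystem instance, so closing it
    // does not invalidate the cached handle shared with the framework.
    conf.setBoolean("fs.hdfs.impl.disable.cache", true);

    FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:8020/"), conf);
    try {
      // ... do whatever work the Java action needs, e.g. check a path
      fs.exists(new Path("/tmp/example"));
    } finally {
      // Safe to close: this instance is not shared via the FileSystem cache.
      fs.close();
    }
  }
}

With the property left at its default (caching on), the same close() call would shut down the instance that MapTask and the rest of the framework are still using, which is how we end up with the "Filesystem closed" exception.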
Thanks,
Hemanth

On Thu, Jan 31, 2013 at 12:19 AM, Alejandro Abdelnur <t...@cloudera.com> wrote:

> Hemanth,
>
> Is FS caching enabled or not in your cluster?
>
> A simple solution would be to modify your mapper code not to close the FS.
> It will go away when the task ends anyway.
>
> Thx
>
>
> On Thu, Jan 24, 2013 at 5:26 PM, Hemanth Yamijala <yhema...@thoughtworks.com> wrote:
>
>> Hi,
>>
>> We are noticing a problem where we get a filesystem closed exception when
>> a map task is done and is finishing execution. By map task, I literally
>> mean the MapTask class of the map reduce code. Debugging this we found that
>> the mapper is getting a handle to the filesystem object and itself calling
>> a close on it. Because filesystem objects are cached, I believe the
>> behaviour is as expected in terms of the exception.
>>
>> I just wanted to confirm that:
>>
>> - if we do have a requirement to use a filesystem object in a mapper or
>> reducer, we should either not close it ourselves
>> - or (seems better to me) ask for a new version of the filesystem
>> instance by setting the fs.hdfs.impl.disable.cache property to true in job
>> configuration.
>>
>> Also, does anyone know if this behaviour was any different in Hadoop 0.20?
>>
>> For some context, this behaviour is actually seen in Oozie, which runs a
>> launcher mapper for a simple java action. Hence, the java action could very
>> well interact with a file system. I know this is probably better addressed
>> in Oozie context, but wanted to get the map reduce view of things.
>>
>>
>> Thanks,
>> Hemanth
>
>
> --
> Alejandro