Hi David,

I'm not aware of any issue that would cause memory leaks when a file is kept open for reading for a long time.
There are, however, some current issues with write pipeline recovery when a file is open for writing for a long time and the datanodes it is writing to fail. So I would not recommend keeping a file open for write for longer than several hours (depending on how frequently you expect failures).

-Todd

On Fri, Jul 3, 2009 at 11:20 AM, David B. Ritch <david.ri...@gmail.com> wrote:

> I have been told that it is not a good idea to keep HDFS files open for
> a long time. The reason sounded like a memory leak in the name node -
> that over time, the resources absorbed by an open file will increase.
>
> Is this still an issue with Hadoop 0.19.x and 0.20.x? Was it ever an
> issue?
>
> I have an application that keeps a number of files open, and executes
> pseudo-random reads from them in response to externally generated
> queries. Should it close and re-open active files that are open longer
> than a certain amount of time? If so, how long is too long to keep a
> file open? And why?
>
> Thanks!
>
> David
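For what it's worth, here is a rough sketch (not from the original thread) of how a read-side client like the one David describes might bound how long any one stream stays open: it uses the standard Hadoop FileSystem / FSDataInputStream API with positioned reads, and re-opens the stream after a caller-chosen interval. The class name, the interval parameter, and the re-open policy are illustrative assumptions, not anything the thread itself specifies.

    // Hypothetical sketch: positioned (pseudo-random) reads from an HDFS file,
    // with the stream closed and re-opened after it has been open for a
    // configurable amount of time. Assumes the plain Hadoop FileSystem API.
    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ReopeningReader {
        private final FileSystem fs;
        private final Path path;
        private final long maxOpenMillis;   // how long to keep one stream open
        private FSDataInputStream in;
        private long openedAt;

        public ReopeningReader(Configuration conf, Path path, long maxOpenMillis)
                throws IOException {
            this.fs = path.getFileSystem(conf);
            this.path = path;
            this.maxOpenMillis = maxOpenMillis;
        }

        // Positioned read; re-opens the stream if it has been open "too long".
        public synchronized int read(long position, byte[] buf, int off, int len)
                throws IOException {
            if (in == null || System.currentTimeMillis() - openedAt > maxOpenMillis) {
                if (in != null) {
                    in.close();
                }
                in = fs.open(path);
                openedAt = System.currentTimeMillis();
            }
            // pread-style call: reads at an absolute offset without moving the
            // stream's file pointer, which suits externally generated queries.
            return in.read(position, buf, off, len);
        }

        public synchronized void close() throws IOException {
            if (in != null) {
                in.close();
                in = null;
            }
        }
    }

Whether an interval like this is even necessary for read-only streams is exactly the question in the thread; the sketch just shows that adding one is cheap if you decide you want it.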