HDFS and long-running processes

2009-07-02 Thread David B. Ritch
I have been told that it is not a good idea to keep HDFS files open for a long time. The reason sounded like a memory leak in the name node - that over time, the resources absorbed by an open file will increase. Is this still an issue with Hadoop 0.19.x and 0.20.x? Was it ever an issue? I have

HDFS and long-running processes

2009-07-03 Thread David B. Ritch
I have been told that it is not a good idea to keep HDFS files open for a long time. The reason sounded like a memory leak in the name node - that over time, the resources absorbed by an open file will increase. Is this still an issue with Hadoop 0.19.x and 0.20.x? Was it ever an issue? I have

Re: HDFS and long-running processes

2009-07-03 Thread Todd Lipcon
Hi David, I'm unaware of any issue that would cause memory leaks when a file is open for read for a long time. There are some issues currently with write pipeline recovery when a file is open for writing for a long time and the datanodes to which it's writing fail. So, I would not recommend havin

Re: HDFS and long-running processes

2009-07-04 Thread David B. Ritch
Thanks, Todd. Perhaps I was misinformed, or misunderstood. I'll make sure I close files occasionally, but it's good to know that the only real issue is with data recovery after losing a node.
David
On 7/3/2009 3:08 PM, Todd Lipcon wrote:
> Hi David,
>
> I'm unaware of any issue that would cause
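The "close files occasionally" approach mentioned in this message can be sketched as a writer that rolls over to a new file once the current one has been open for some time. This is only an illustration under stated assumptions, not code from the thread and not the Hadoop API: `RotatingWriter`, `open_fn`, and `max_open_seconds` are hypothetical names, and `open_fn` stands in for whatever call a given client uses to create an HDFS output stream.

```python
import time
from typing import Any, Callable, Optional


class RotatingWriter:
    """Write to a sequence of files, closing each one after a time limit.

    Hypothetical helper: open_fn(index) must return an object with
    write() and close() methods, for example a wrapper around an HDFS
    output stream. No particular HDFS client library is assumed.
    """

    def __init__(
        self,
        open_fn: Callable[[int], Any],
        max_open_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._open_fn = open_fn
        self._max_open_seconds = max_open_seconds
        self._clock = clock
        self._stream: Optional[Any] = None
        self._opened_at = 0.0
        self._index = 0

    def write(self, data: bytes) -> None:
        now = self._clock()
        # Close the current file if it has been open for too long.
        if self._stream is not None and now - self._opened_at >= self._max_open_seconds:
            self.close()
        # Open the next file in the sequence on demand.
        if self._stream is None:
            self._stream = self._open_fn(self._index)
            self._opened_at = now
            self._index += 1
        self._stream.write(data)

    def close(self) -> None:
        # Safe to call when no file is open.
        if self._stream is not None:
            self._stream.close()
            self._stream = None
```

The rollover check runs only when `write` is called, so a file that receives no further writes stays open until `close` is called explicitly. A long-running process would typically also call `close` on shutdown.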

Re: HDFS and long-running processes

2009-07-06 Thread Todd Lipcon
On Sat, Jul 4, 2009 at 9:08 AM, David B. Ritch wrote:
> Thanks, Todd. Perhaps I was misinformed, or misunderstood. I'll make
> sure I close files occasionally, but it's good to know that the only
> real issue is with data recovery after losing a node.
>
Just to be clear, there aren't issues wit

Re: HDFS and long-running processes

2009-07-06 Thread stack
On Fri, Jul 3, 2009 at 11:20 AM, David B. Ritch wrote:
> I have been told that it is not a good idea to keep HDFS files open for
> a long time. The reason sounded like a memory leak in the name node -
> that over time, the resources absorbed by an open file will increase.
>
> Is this still an iss

Re: HDFS and long-running processes

2009-07-21 Thread Steve Loughran
Todd Lipcon wrote:
> On Sat, Jul 4, 2009 at 9:08 AM, David B. Ritch wrote:
>> Thanks, Todd. Perhaps I was misinformed, or misunderstood. I'll make
>> sure I close files occasionally, but it's good to know that the only
>> real issue is with data recovery after losing a node.
> Just to be clear, there

Re: HDFS and long-running processes

2009-07-21 Thread Todd Lipcon
On Tue, Jul 21, 2009 at 3:26 AM, Steve Loughran wrote:
> Todd Lipcon wrote:
>> On Sat, Jul 4, 2009 at 9:08 AM, David B. Ritch wrote:
>>> Thanks, Todd. Perhaps I was misinformed, or misunderstood. I'll make
>>> sure I close files occasionally, but it's good to know that the only
>>> real