Thanks, Todd.  Perhaps I was misinformed, or misunderstood.  I'll make
sure I close files occasionally, but it's good to know that the only
real issue is with data recovery after losing a node.

David

On 7/3/2009 3:08 PM, Todd Lipcon wrote:
> Hi David,
>
> I'm unaware of any issue that would cause memory leaks when a file is open
> for read for a long time.
>
> There are some issues currently with write pipeline recovery when a file is
> open for writing for a long time and the datanodes to which it's writing
> fail. So, I would not recommend having a file open for write for longer than
> several hours (depending on the frequency with which you expect failures).
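
[For illustration, a minimal sketch of the kind of periodic roll-over this advice
points at, using the standard org.apache.hadoop.fs client API. The class name,
the roll interval, and the part-file naming below are illustrative assumptions,
not anything HDFS itself prescribes.]

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/**
 * Closes the current output file and starts a new one every rollIntervalMillis,
 * so that no single file stays open for write for more than a bounded time.
 */
public class RollingWriter {

    private final FileSystem fs;
    private final Path dir;                  // directory the part files go into
    private final long rollIntervalMillis;   // keep this well under "several hours"

    private FSDataOutputStream out;
    private long openedAt;

    public RollingWriter(Configuration conf, Path dir, long rollIntervalMillis)
            throws IOException {
        this.fs = FileSystem.get(conf);
        this.dir = dir;
        this.rollIntervalMillis = rollIntervalMillis;
        roll();
    }

    // Close the current part file (if any) and open a fresh one.
    private void roll() throws IOException {
        if (out != null) {
            out.close();
        }
        Path part = new Path(dir, "part-" + System.currentTimeMillis());
        out = fs.create(part);
        openedAt = System.currentTimeMillis();
    }

    public void write(byte[] record) throws IOException {
        if (System.currentTimeMillis() - openedAt > rollIntervalMillis) {
            roll();
        }
        out.write(record);
    }

    public void close() throws IOException {
        out.close();
    }
}

Rolling to a fresh part file keeps each write pipeline short-lived, which is the
property the advice above is after.
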
>
> -Todd
>
> On Fri, Jul 3, 2009 at 11:20 AM, David B. Ritch <david.ri...@gmail.com> wrote:
>
>   
>> I have been told that it is not a good idea to keep HDFS files open for
>> a long time.  The reason sounded like a memory leak in the name node -
>> that over time, the resources absorbed by an open file will increase.
>>
>> Is this still an issue with Hadoop 0.19.x and 0.20.x?  Was it ever an
>> issue?
>>
>> I have an application that keeps a number of files open, and executes
>> pseudo-random reads from them in response to externally generated
>> queries.  Should it close and re-open active files that are open longer
>> than a certain amount of time?  If so, how long is too long to keep a
>> file open?  And why?
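
[For illustration, a minimal sketch of that pseudo-random read pattern, using the
positioned reads on FSDataInputStream from the standard org.apache.hadoop.fs
client API. The path, record size, and offsets are made-up placeholders.]

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class RandomReadExample {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path path = new Path("/data/lookup.dat");     // hypothetical file

        // One handle stays open across many queries.
        FSDataInputStream in = fs.open(path);
        try {
            byte[] record = new byte[128];             // hypothetical record size
            long[] offsets = { 0L, 4096L, 1048576L };  // offsets driven by the queries
            for (long offset : offsets) {
                in.readFully(offset, record);          // positioned read (pread)
                // ... answer the query from 'record' ...
            }
        } finally {
            in.close();   // close (and re-open later) rather than holding on forever
        }
    }
}

The positioned form of read/readFully takes an explicit offset and does not
disturb the stream's own position, so one long-lived handle can serve many
queries back to back, with an occasional close-and-reopen if desired.
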
>>
>> Thanks!
>>
>> David
>>
>>     
>
>   
