Hi Raghu,

The file doesn't appear in the cluster when I look for it in the NameNode
UI. I also have a monitor on the cluster side which checks whether the file
has been created and throws an exception when it hasn't; it threw an
exception saying "File not found".

Thanks
Pallavi
----- Original Message -----
From: "Raghu Angadi" <rang...@yahoo-inc.com>
To: common-user@hadoop.apache.org
Sent: Wednesday, August 12, 2009 12:10:12 AM GMT +05:30 Chennai, Kolkata, 
Mumbai, New Delhi
Subject: Re: File is closed but data is not visible


Your assumption is correct. When you close the file, others can read the
data. There is no delay expected before the data is visible. If there is
an error, either write() or close() will throw an exception.
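
To be concrete, the expected sequence is something like this (a rough
sketch; the path and payload are made up):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    FileSystem fs = FileSystem.get(new Configuration());
    Path p = new Path("/tmp/visibility-test");

    // Writer: a failure (datanode down, disk full, ...) surfaces here
    // as an IOException instead of silently missing data.
    FSDataOutputStream out = fs.create(p);
    out.writeBytes("hello\n");
    out.close();

    // Reader: opened only after close() has returned on the writer,
    // so the data should already be visible.
    FSDataInputStream in = fs.open(p);
    byte[] buf = new byte[6];
    in.readFully(buf);
    in.close();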

When you say the data is not visible, do you mean readers cannot see the
file or cannot see the data? Is it guaranteed that readers open the
file _after_ close() returns on the writer?

Raghu.

Palleti, Pallavi wrote:
> Hi Jason,
> 
> Apologies for missing the version information in my previous mail. I am
> using hadoop-0.18.3. I am getting an FSDataOutputStream object using
> fs.create(new Path(some_file_name)), where fs is a FileSystem object,
> and I am closing the file using close().
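> 
> In essence, the write path is just this (trimmed; some_file_name is a
> placeholder for our actual file name):
> 
>     import org.apache.hadoop.conf.Configuration;
>     import org.apache.hadoop.fs.FSDataOutputStream;
>     import org.apache.hadoop.fs.FileSystem;
>     import org.apache.hadoop.fs.Path;
> 
>     String some_file_name = "/path/to/log";  // placeholder
>     FileSystem fs = FileSystem.get(new Configuration());
>     FSDataOutputStream out = fs.create(new Path(some_file_name));
>     // ... write the pulled log data to out ...
>     out.close();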
> 
> Thanks
> Pallavi
> 
> -----Original Message-----
> From: Jason Venner [mailto:jason.had...@gmail.com] 
> Sent: Tuesday, August 11, 2009 6:24 PM
> To: common-user@hadoop.apache.org
> Subject: Re: File is closed but data is not visible
> 
> Please provide information on what version of Hadoop you are using and
> the method of opening and closing the file.
> 
> 
> On Tue, Aug 11, 2009 at 12:48 AM, Pallavi Palleti <
> pallavi.pall...@corp.aol.com> wrote:
> 
>> Hi all,
>>
>> We have an application where we pull logs from an external server (far
>> apart from the hadoop cluster) to the hadoop cluster. Sometimes we see a
>> huge delay (of an hour or more) before the data actually appears in
>> HDFS, even though the file has been closed and the variable set to null
>> in the external application. I was under the impression that when I
>> close the file, the data gets reflected in the hadoop cluster. In this
>> situation it is even more complicated to handle write failures, as the
>> client gets the false impression that the data has been written to HDFS.
>> Kindly clarify whether my perception is correct. If yes, could someone
>> tell me what is causing the delay in actually showing the data? In those
>> cases, how can we tackle write failures (due to temporary issues like a
>> data node not being available or a disk being full), given that there is
>> no way to detect the failure at the client side?
>>
>> Thanks
>> Pallavi
>>
> 
> 
> 
