Hi Raghu,

The file doesn't appear in the cluster when I check from the Namenode UI. I also have a monitor on the cluster side which checks whether the file is created and throws an exception when it is not. It threw an exception saying "File not found".
Thanks
Pallavi

----- Original Message -----
From: "Raghu Angadi" <rang...@yahoo-inc.com>
To: common-user@hadoop.apache.org
Sent: Wednesday, August 12, 2009 12:10:12 AM GMT +05:30 Chennai, Kolkata, Mumbai, New Delhi
Subject: Re: File is closed but data is not visible

Your assumption is correct. When you close the file, others can read the data. There is no delay expected before the data is visible. If there is an error, either write() or close() would throw it.

When you say the data is not visible, do you mean readers cannot see the file, or cannot see the data? Is it guaranteed that readers open the file _after_ close() returns on the writer?

Raghu.

Palleti, Pallavi wrote:
> Hi Jason,
>
> Apologies for missing the version information in my previous mail. I am
> using hadoop-0.18.3. I am getting an FSDataOutputStream object using
> fs.create(new Path(some_file_name)), where fs is a FileSystem object, and
> I am closing the file using close().
>
> Thanks
> Pallavi
>
> -----Original Message-----
> From: Jason Venner [mailto:jason.had...@gmail.com]
> Sent: Tuesday, August 11, 2009 6:24 PM
> To: common-user@hadoop.apache.org
> Subject: Re: File is closed but data is not visible
>
> Please provide information on what version of hadoop you are using and
> the method of opening and closing the file.
>
>
> On Tue, Aug 11, 2009 at 12:48 AM, Pallavi Palleti <
> pallavi.pall...@corp.aol.com> wrote:
>
>> Hi all,
>>
>> We have an application where we pull logs from an external server (far apart
>> from the hadoop cluster) to the hadoop cluster. Sometimes we see a huge delay
>> (of an hour or more) before the data actually appears in HDFS, even though
>> the file has been closed and the variable set to null in the external
>> application. I was under the impression that when I close the file, the data
>> gets reflected in the hadoop cluster.
>> Now, in this situation, it is even more complicated to handle write
>> failures, as the client gets the false impression that the data has been
>> written to HDFS. Kindly clarify whether my perception is correct. If yes,
>> could someone tell me what is causing the delay in actually showing the
>> data? In those cases, how can we handle write failures (due to temporary
>> issues like a datanode being unavailable or a disk being full) when there
>> is no way to detect the failure at the client side?
>>
>> Thanks
>> Pallavi
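[For context, the create/write/close pattern discussed in this thread can be sketched roughly as below. This is a minimal sketch against the hadoop-0.18-era FileSystem API, not the poster's actual code; the path and payload are placeholders. Catching the IOException from write()/close() and verifying the file length afterward is one way to detect write failures on the client side, per Raghu's point that errors surface through those calls.]

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsWriteCheck {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path path = new Path("/logs/example.log"); // placeholder path

        byte[] data = "a log line\n".getBytes();
        FSDataOutputStream out = fs.create(path);
        try {
            out.write(data);
        } finally {
            // close() flushes the remaining data to the datanodes; an
            // IOException here means the write did NOT fully succeed and
            // must be handled (retried or reported), not ignored.
            out.close();
        }

        // After a successful close(), the file and its full length should be
        // visible to other readers. Verifying the length catches failures
        // that would otherwise give the client a false impression of success.
        long len = fs.getFileStatus(path).getLen();
        if (len != data.length) {
            throw new java.io.IOException("HDFS file " + path + " has length "
                + len + ", expected " + data.length);
        }
    }
}
```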