This isn't possible at present. Once you close the open file stream for a sequence file, you're done with it. I'd advise not closing it and using hflush instead, much like a WAL. Close it only once you've hit some threshold, then open a new file. The hflush (or sync in 1.x) will ensure that the latest additions are immediately available to all new readers.
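To make the pattern concrete, here's a rough sketch in plain Java. Note it is only an analogy on the local filesystem using java.nio, not the Hadoop API: the class name RollingLogWriter, the segment-N.log naming and the byte threshold are all made up for illustration. On HDFS you'd keep the sequence file writer open and call hflush (or sync in 1.x) where this sketch calls flush().

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

/** Hypothetical WAL-style writer: keep one file open, flush per batch, roll at a threshold. */
final class RollingLogWriter implements AutoCloseable {
    private final Path dir;
    private final long maxBytes;   // roll threshold in bytes (assumed policy, pick your own)
    private OutputStream out;
    private long written;          // bytes written to the current segment
    private int segment;           // number of segments opened so far

    RollingLogWriter(Path dir, long maxBytes) throws IOException {
        this.dir = Files.createDirectories(dir);
        this.maxBytes = maxBytes;
        openNext();
    }

    private void openNext() throws IOException {
        out = Files.newOutputStream(dir.resolve("segment-" + segment++ + ".log"));
        written = 0;
    }

    /** Append a batch, then flush so that readers opened afterwards can see it. */
    void appendBatch(List<String> records) throws IOException {
        for (String r : records) {
            byte[] b = (r + "\n").getBytes(StandardCharsets.UTF_8);
            out.write(b);
            written += b.length;
        }
        out.flush();               // local stand-in for hflush() / sync()
        if (written >= maxBytes) { // threshold reached: close for good, start a new file
            out.close();
            openNext();
        }
    }

    int segmentsOpened() { return segment; }

    @Override
    public void close() throws IOException { out.close(); }
}
```

A closed segment is never reopened; new data always goes to the currently open file or to a fresh one.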
The patch at https://issues.apache.org/jira/browse/HADOOP-7139 will help solve this limitation though. It's under review and needs some further work.

On Tue, Apr 24, 2012 at 6:47 PM, Florin P <florinp...@yahoo.com> wrote:
> Hello!
> Thank you for your responses. I've read in this posts
> http://stackoverflow.com/questions/5598400/hdfs-using-hdfs-api-to-append-to-a-sequencefile
> also
> https://issues.apache.org/jira/browse/HADOOP-3977
>
> that you cannot add new fresh data in an existing SequenceFile. So,
> basically, you have the scenario:
> 1. Writing to a SequenceFile
> 2. Close the file
> 2. Reopen the written file
> 3. Add new fresh data to it
> 4. Close the file
> At the end you'll have the old data plus new added data. Can you have an
> example (code) how you can achieve this scenario with the API? Please
> specify which version you're using.
>
> Thank you.
>
> Regards,
> Florin
>
> ________________________________
> From: Ioan Eugen Stan <stan.ieu...@gmail.com>
> To: hdfs-user@hadoop.apache.org; Florin P <florinp...@yahoo.com>
> Sent: Friday, April 13, 2012 1:23 PM
>
> Subject: Re: Is append allowed in HDFS?
>
> 2012/4/13 Florin P <florinp...@yahoo.com>:
>> Hello!
>> Thank you all for all responses. It is possible to have a matrix of
>> hadoop
>> file input format that supports append or if I understood correctly, all
>> formats are now supporting append?
>> Thanks a lot.
>> Regards,
>> Florin
>
> Hi Florin,
>
> Append is a file-system feature not a file format feature although
> some file formats are designed to be immutable (MapFile, HFile). You
> can append to them, just don't use the interface they normally
> provide.
>
>> ________________________________
>> From: Inder Pall <inder.p...@gmail.com>
>> To: hdfs-user@hadoop.apache.org
>> Sent: Tuesday, April 10, 2012 8:12 AM
>> Subject: Re: Is append allowed in HDFS?
>>
>> Harsh,
>>
>> idea is to call sync for a configured batch. Still under implementation as
>> other parts of the system's aren't complete.
>>
>> recovery/resume-from-errors-at-DN code around general tail-like >>>>This sounds promising, can you please shed some more light on this.
>>
>> - inder
>> On Tue, Apr 10, 2012 at 1:07 AM, Harsh J <ha...@cloudera.com> wrote:
>>
>> Your approach looks fine to me. I'd throw in some
>> recovery/resume-from-errors-at-DN code around general tail-like
>> consumption but I think you may have already done that :)
>>
>> But just for my curiosity - do you call sync for every record/unit or
>> batch it by a few, for your problem?
>>
>> On Mon, Apr 9, 2012 at 10:34 PM, Inder Pall <inder.p...@gmail.com> wrote:
>>> Yes makes sense. My use-case is more like a producer/consumer and
>>> consumer
>>> trying to stream data as it arrives.
>>> Has anyone hit this before and if so resolved it in a better way.
>>>
>>> Apologies, if i am digressing from the subject of this thread however
>>> seems
>>> to land in the bucket of append support in HDFS.
>>>
>>> - Inder
>>>
>>> On Mon, Apr 9, 2012 at 6:27 PM, Harsh J <ha...@cloudera.com> wrote:
>>>>
>>>> Inder,
>>>>
>>>> Yes, that is a requirement for readers of sync-ing data. The new meta
>>>> entries can only be read by new readers. The read code would end up
>>>> being exactly like the implementation for method "fs -tail" at
>>>>
>>>> http://svn.apache.org/viewvc/hadoop/common/branches/branch-1/src/core/org/apache/hadoop/fs/FsShell.java?view=markup
>>>> (Line 1101)
>>>>
>>>> HBase does not read the WAL (HLog) continuously/vigorously as it
>>>> syncs, by the way. It only reads the them when a specific request is
>>>> made (for splitting, replaying and debug-printing).
>>>>
>>>> On Mon, Apr 9, 2012 at 6:05 PM, Inder Pall <inder.p...@gmail.com> wrote:
>>>> > Based on what i have tried, after a sync you need to open a new
>>>> > Reader.
>>>> > Please correct if that's not the write semantics.
>>>> >
>>>> > Thanks,
>>>> > - Inder
>>>> >
>>>> > On Mon, Apr 9, 2012 at 4:23 PM, Harsh J <ha...@cloudera.com> wrote:
>>>> >>
>>>> >> I'd also like to note that there are some unresolved issues with the
>>>> >> append version in the 1.x (stable) line.
>>>> >>
>>>> >> Note that HBase's use of the 0.20-append branch features are limited
>>>> >> to using "sync" calls alone (Described in p68 "Coherency Model",
>>>> >> Chapter 3 (The Hadoop Distributed File System) in Hadoop: The
>>>> >> Definitive Guide, 2nd Edition (O'Reilly)). Not the file-reopening
>>>> >> "append" calls. The latter is what is still with issues in the 1.x
>>>> >> releases today. Using the former is alright if its done in the way
>>>> >> similar to HBase's WAL (HLog) (or for similar needs).
>>>> >>
>>>> >> On Mon, Apr 9, 2012 at 3:45 PM, Ioan Eugen Stan
>>>> >> <stan.ieu...@gmail.com>
>>>> >> wrote:
>>>> >> > 2012/4/7 Florin P <florinp...@yahoo.com>:
>>>> >> >> Hello!
>>>> >> >> Just google it for supporting of append into HDFS files and the
>>>> >> >> result:
>>>> >> >> I'm puzzled. Can someone say: YES you can append in TextFile or
>>>> >> >> SequenceFile
>>>> >> >> or whatever format. If yes, in which version this feature is
>>>> >> >> supported
>>>> >> >> ?
>>>> >> >> Also where can I find a good example of using the API? I know that
>>>> >> >> is a
>>>> >> >> long
>>>> >> >> debate about this subject, but really it is challenge to find on
>>>> >> >> the
>>>> >> >> google
>>>> >> >> the current status of this feature.
>>>> >> >> I look forward for a trust source answer.
>>>> >> >> Thank you,
>>>> >> >> Regards,
>>>> >> >> Florin
>>>> >> >
>>>> >> > Hi Florian,
>>>> >> >
>>>> >> > HDFS supports append in Hadoop 1.0.x branch and also 0.22 (a.k.a
>>>> >> > hadoop 2.x branch) and 0.23 (a.k.a hadoop 3.x branch).
>>>> >> >
>>>> >> > [1] http://hbase.apache.org/book/hadoop.html
>>>> >> > [2] http://hbase.apache.org/book/hadoop.html -- search for append
>>>> >> > in
>>>> >> > release notes
>>>> >> >
>>>> >> > Cheers,
>>>> >> > --
>>>> >> > Ioan Eugen Stan
>>>> >> > http://ieugen.blogspot.com/
>>>> >>
>>>> >> --
>>>> >> Harsh J
>>>> >
>>>> > --
>>>> > Thanks,
>>>> > - Inder
>>>> > Tech Platforms @Inmobi
>>>> > Linkedin - http://goo.gl/eR4Ub
>>>>
>>>> --
>>>> Harsh J
>>>
>>> --
>>> Thanks,
>>> - Inder
>>> Tech Platforms @Inmobi
>>> Linkedin - http://goo.gl/eR4Ub
>>
>> --
>> Harsh J
>>
>> --
>> Thanks,
>> - Inder
>> Tech Platforms @Inmobi
>> Linkedin - http://goo.gl/eR4Ub
>>
>
> --
> Ioan Eugen Stan
> http://ieugen.blogspot.com/
>

--
Harsh J