Hello!
  Thank you for your responses. I've read in this post
http://stackoverflow.com/questions/5598400/hdfs-using-hdfs-api-to-append-to-a-sequencefile 
and also in
https://issues.apache.org/jira/browse/HADOOP-3977

that you cannot add new data to an existing SequenceFile. So, 
basically, the scenario is:
1. Write to a SequenceFile
2. Close the file
3. Reopen the written file
4. Add new data to it
5. Close the file
At the end you'll have the old data plus the newly added data. Could you 
provide a code example showing how to achieve this scenario with the API? 
Please specify which version you're using. 

Thank you.

Regards,
  Florin




________________________________
 From: Ioan Eugen Stan <stan.ieu...@gmail.com>
To: hdfs-user@hadoop.apache.org; Florin P <florinp...@yahoo.com> 
Sent: Friday, April 13, 2012 1:23 PM
Subject: Re: Is append allowed in HDFS?
 
2012/4/13 Florin P <florinp...@yahoo.com>:
> Hello!
>  Thank you all for your responses. Is it possible to have a matrix of
> Hadoop file input formats that support append or, if I understood
> correctly, do all formats now support append?
> Thanks a lot.
>   Regards,
>  Florin

Hi Florin,

Append is a file-system feature, not a file-format feature, although
some file formats are designed to be immutable (MapFile, HFile). You
can append to them, just don't use the interface they normally
provide.
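
A minimal sketch of the write / close / reopen / append / close scenario
follows. It is only an analogy: it uses the local file system through
java.nio, not the HDFS API, so that it runs without a cluster. The class
name AppendSketch, the method writeThenAppend and the sample records are
hypothetical. On HDFS the corresponding entry point would be
FileSystem.append(Path), and whether that call is usable depends on the
Hadoop version and on append being enabled, as discussed further down in
this thread.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

// Local-filesystem analogy only; this does not touch HDFS or SequenceFile.
public class AppendSketch {

    // Writes two records, closes the file, reopens it in append mode,
    // adds one more record, closes it again, and returns the full content.
    static List<String> writeThenAppend(Path file) throws IOException {
        // Steps 1-2: write to the file; Files.write closes it when done.
        Files.write(file, List.of("old-1", "old-2"), StandardCharsets.UTF_8,
                StandardOpenOption.CREATE,
                StandardOpenOption.TRUNCATE_EXISTING,
                StandardOpenOption.WRITE);

        // Steps 3-5: reopen the same file for append, add new data, close.
        // Appending is requested from the file system, not from the format.
        Files.write(file, List.of("new-1"), StandardCharsets.UTF_8,
                StandardOpenOption.WRITE,
                StandardOpenOption.APPEND);

        // The old data is still there, followed by the newly added data.
        return Files.readAllLines(file, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("append-sketch", ".txt");
        try {
            System.out.println(writeThenAppend(file));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

The point of the sketch is that the append is an open mode of the file
system; a record-oriented format layered on top still has to cope with
its own header and sync markers when the file is reopened.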

> ________________________________
> From: Inder Pall <inder.p...@gmail.com>
> To: hdfs-user@hadoop.apache.org
> Sent: Tuesday, April 10, 2012 8:12 AM
> Subject: Re: Is append allowed in HDFS?
>
> Harsh,
>
> idea is to call sync for a configured batch. Still under implementation as
> other parts of the system's aren't complete.
>
> recovery/resume-from-errors-at-DN code around general tail-like
>>>This sounds promising, can you please shed some more light on this.
>
> - inder
> On Tue, Apr 10, 2012 at 1:07 AM, Harsh J <ha...@cloudera.com> wrote:
>
> Your approach looks fine to me. I'd throw in some
> recovery/resume-from-errors-at-DN code around general tail-like
> consumption but I think you may have already done that :)
>
> But just for my curiosity - do you call sync for every record/unit or
> batch it by a few, for your problem?
>
> On Mon, Apr 9, 2012 at 10:34 PM, Inder Pall <inder.p...@gmail.com> wrote:
>> Yes, makes sense. My use-case is more like a producer/consumer, with the
>> consumer trying to stream data as it arrives.
>> Has anyone hit this before and, if so, resolved it in a better way?
>>
>> Apologies if I am digressing from the subject of this thread; however, it
>> seems to land in the bucket of append support in HDFS.
>>
>> - Inder
>>
>>
>> On Mon, Apr 9, 2012 at 6:27 PM, Harsh J <ha...@cloudera.com> wrote:
>>>
>>> Inder,
>>>
>>> Yes, that is a requirement for readers of sync-ing data. The new meta
>>> entries can only be read by new readers. The read code would end up
>>> being exactly like the implementation for method "fs -tail" at
>>>
>>>
>>>
>>> http://svn.apache.org/viewvc/hadoop/common/branches/branch-1/src/core/org/apache/hadoop/fs/FsShell.java?view=markup
>>> (Line 1101)
>>>
>>> HBase does not read the WAL (HLog) continuously/vigorously as it
>>> syncs, by the way. It only reads them when a specific request is
>>> made (for splitting, replaying and debug-printing).
>>>
>>> On Mon, Apr 9, 2012 at 6:05 PM, Inder Pall <inder.p...@gmail.com> wrote:
>>> > Based on what i have tried, after a sync you need to open a new
>>> > Reader.
>>> > Please correct me if those are not the right semantics.
>>> >
>>> > Thanks,
>>> > - Inder
>>> >
>>> >
>>> > On Mon, Apr 9, 2012 at 4:23 PM, Harsh J <ha...@cloudera.com> wrote:
>>> >>
>>> >> I'd also like to note that there are some unresolved issues with the
>>> >> append version in the 1.x (stable) line.
>>> >>
>>> >> Note that HBase's use of the 0.20-append branch features is limited
>>> >> to using "sync" calls alone (described on p68, "Coherency Model",
>>> >> Chapter 3 (The Hadoop Distributed File System) in Hadoop: The
>>> >> Definitive Guide, 2nd Edition (O'Reilly)), not the file-reopening
>>> >> "append" calls. The latter is what still has issues in the 1.x
>>> >> releases today. Using the former is alright if it's done in a way
>>> >> similar to HBase's WAL (HLog) (or for similar needs).
>>> >>
>>> >> On Mon, Apr 9, 2012 at 3:45 PM, Ioan Eugen Stan
>>> >> <stan.ieu...@gmail.com>
>>> >> wrote:
>>> >> > 2012/4/7 Florin P <florinp...@yahoo.com>:
>>> >> >> Hello!
>>> >> >>   I googled for append support in HDFS files and I'm puzzled by the
>>> >> >> results. Can someone say: YES, you can append to a TextFile or
>>> >> >> SequenceFile or whatever format? If yes, in which version is this
>>> >> >> feature supported?
>>> >> >> Also, where can I find a good example of using the API? I know there
>>> >> >> has been a long debate about this subject, but it is really a
>>> >> >> challenge to find the current status of this feature on Google.
>>> >> >> I look forward to an answer from a trusted source.
>>> >> >> Thank you,
>>> >> >>   Regards,
>>> >> >>    Florin
>>> >> >
>>> >> > Hi Florin,
>>> >> >
>>> >> > HDFS supports append in Hadoop 1.0.x branch and also 0.22 (a.k.a
>>> >> > hadoop 2.x branch) and 0.23 (a.k.a hadoop 3.x branch).
>>> >> >
>>> >> > [1] http://hbase.apache.org/book/hadoop.html
>>> >> > [2] http://hbase.apache.org/book/hadoop.html -- search for append
>>> >> > in
>>> >> > release notes
>>> >> >
>>> >> > Cheers,
>>> >> > --
>>> >> > Ioan Eugen Stan
>>> >> > http://ieugen.blogspot.com/
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> Harsh J
>>> >
>>> >
>>> >
>>> >
>>> > --
>>> > Thanks,
>>> > - Inder
>>> >   Tech Platforms @Inmobi
>>> >   Linkedin - http://goo.gl/eR4Ub
>>>
>>>
>>>
>>> --
>>> Harsh J
>>
>>
>>
>>
>> --
>> Thanks,
>> - Inder
>>   Tech Platforms @Inmobi
>>   Linkedin - http://goo.gl/eR4Ub
>
>
>
> --
> Harsh J
>
>
>
>
> --
> Thanks,
> - Inder
>   Tech Platforms @Inmobi
>   Linkedin - http://goo.gl/eR4Ub
>
>



--
Ioan Eugen Stan
http://ieugen.blogspot.com/
