I am trying to find examples that demonstrate using sequence files,
including writing to one and then running MapReduce on it, but I have been
unable to find any. Could you please point me to some examples?

On Tue, Feb 21, 2012 at 10:25 AM, Bejoy Ks <bejoy.had...@gmail.com> wrote:

> Hi Mohit
> AFAIK XMLLoader in Pig is not suited to sequence files. Please post the
> question to the Pig user group to find a workaround.
> SequenceFile is a preferred option when we want to store small files in
> HDFS that need to be processed by MapReduce, as it stores data in
> key/value format. Since SequenceFileInputFormat is already available, you
> don't need a custom input format to process such files with MapReduce.
> It is a cleaner and better approach than just appending the contents of
> small XML files to one big file.
>
> On Tue, Feb 21, 2012 at 11:00 PM, Mohit Anchlia <mohitanch...@gmail.com> wrote:
>
> > On Tue, Feb 21, 2012 at 9:25 AM, Bejoy Ks <bejoy.had...@gmail.com> wrote:
> >
> > > Mohit
> > > Rather than just appending the content to a normal text file, you can
> > > create a sequence file with the contents of the individual smaller
> > > files as values.
> > >
> > Thanks. I was planning to use Pig's
> > org.apache.pig.piggybank.storage.XMLLoader for processing. Would it work
> > with a sequence file?
> >
> > The text file I was referring to would be in HDFS itself. Is that still
> > different from using a sequence file?
> >
> > > Regards
> > > Bejoy.K.S
> > >
> > > On Tue, Feb 21, 2012 at 10:45 PM, Mohit Anchlia <mohitanch...@gmail.com> wrote:
> > >
> > > > We have small XML files. Currently I am planning to append these small
> > > > files to one file in HDFS so that I can take advantage of splits, larger
> > > > blocks and sequential I/O. What I am unsure of is whether it's OK to
> > > > append one file at a time to this HDFS file.
> > > >
> > > > Could someone suggest whether this is OK? I would like to know how
> > > > others do it.
> > > >
> > >
> >
>
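
To make the key/value point from Bejoy's reply concrete, here is a minimal
sketch of the idea in plain Java. It is NOT the Hadoop SequenceFile format or
API. It only packs small XML documents into one container as length-prefixed
(file name, contents) records, which is roughly the property that makes a
key/value container cleaner than blind appending: each document keeps its own
boundary and name. The class name SmallFilePacker and its methods are
hypothetical, chosen just for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: illustrates key/value record packing, not Hadoop's on-disk format.
final class SmallFilePacker {

    // Write each (file name, contents) pair as one length-prefixed record.
    static byte[] pack(Map<String, String> files) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            for (Map.Entry<String, String> entry : files.entrySet()) {
                byte[] value = entry.getValue().getBytes(StandardCharsets.UTF_8);
                out.writeUTF(entry.getKey());  // key: original file name
                out.writeInt(value.length);    // length prefix marks the record boundary
                out.write(value);              // value: the whole XML document
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Read the records back in order, recovering every document intact.
    static Map<String, String> unpack(byte[] container) {
        Map<String, String> files = new LinkedHashMap<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(container))) {
            while (in.available() > 0) {
                String key = in.readUTF();
                byte[] value = new byte[in.readInt()];
                in.readFully(value);
                files.put(key, new String(value, StandardCharsets.UTF_8));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return files;
    }

    public static void main(String[] args) {
        Map<String, String> files = new LinkedHashMap<>();
        files.put("a.xml", "<a>1</a>");
        files.put("b.xml", "<b>two</b>");
        // Round-trip the documents and list the keys that were recovered.
        for (String name : unpack(pack(files)).keySet()) {
            System.out.println(name);
        }
    }
}
```

With plain appending, the reader has to rediscover where one XML document
ends and the next begins. With keyed records, each map call can be handed one
complete document, which is what SequenceFileInputFormat does for real
sequence files.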
