Re: Fully distribute TextInputFormat...

Alex Baranov Mon, 10 May 2010 22:27:54 -0700

I meant splitting a very large file so that it can be distributed over multiple map tasks.
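
To illustrate what I mean by splitting, here is a minimal sketch in plain Java with no Hadoop dependency. It is a simplified model and not Hadoop's actual split logic: the class name SplitSketch, the Split type and the computeSplits signature are all made up for illustration, and a real input format also has to deal with block locations and record boundaries. The splittable flag models a format such as a gzip stream, which can only be read from the beginning.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of carving one file into byte-range splits, one per map task.
// Hypothetical names throughout; this is not Hadoop's implementation.
class SplitSketch {

    // One split: the byte range [offset, offset + length) of the input file.
    static final class Split {
        final long offset;
        final long length;

        Split(long offset, long length) {
            this.offset = offset;
            this.length = length;
        }

        @Override
        public String toString() {
            return "[" + offset + ", +" + length + ")";
        }
    }

    // Returns the splits for a file of fileLength bytes.
    // If splittable is false, the whole file becomes a single split,
    // so only one task can work on it.
    static List<Split> computeSplits(long fileLength, long splitSize, boolean splittable) {
        if (splitSize <= 0) {
            throw new IllegalArgumentException("splitSize must be positive");
        }
        List<Split> splits = new ArrayList<>();
        if (fileLength <= 0) {
            return splits; // empty file: nothing to process
        }
        if (!splittable) {
            splits.add(new Split(0, fileLength));
            return splits;
        }
        // Walk the file in splitSize steps; the last split may be shorter.
        for (long offset = 0; offset < fileLength; offset += splitSize) {
            splits.add(new Split(offset, Math.min(splitSize, fileLength - offset)));
        }
        return splits;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // 200 MB file, 64 MB splits, splittable: prints 4
        System.out.println(computeSplits(200 * mb, 64 * mb, true).size());
        // Same file treated as non-splittable: prints 1
        System.out.println(computeSplits(200 * mb, 64 * mb, false).size());
    }
}
```

With a splittable file each of the four ranges could go to a different map task; with a non-splittable one, a single task has to read everything.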


Alex.

http://sematext.com

On Tue, May 11, 2010 at 6:13 AM, himanshu chandola <
himanshu_cool...@yahoo.com> wrote:

> Actually, would you have a case where no splitting is needed? Just curious.
>
> It seems that you would use LZO or not use any compression at all.
>
> H
>
> ----- Original Message ----
> From: Alex Baranov <alex.barano...@gmail.com>
> To: common-user@hadoop.apache.org
> Sent: Mon, May 10, 2010 4:27:11 PM
> Subject: Re: Fully distribute TextInputFormat...
>
> If I'm not mistaken, LZO compression is better suited when splitting is
> needed, not gzip.
>
> Alex Baranau
>
> http://sematext.com
>
> On Mon, May 10, 2010 at 3:52 PM, Jeff Zhang <zjf...@gmail.com> wrote:
>
> > What's the format of this file? gzip cannot be split.
> >
> >
> >
> > On Mon, May 10, 2010 at 5:21 AM, Pierre ANCELOT <pierre...@gmail.com>
> > wrote:
> > > Hi folks :)
> > > I have one big file... I read it with FileInputFormat, this generates
> > > only one task and of course, this doesn't get distributed across the
> > > cluster nodes.
> > > Should I use another Input class or do I have a bug in my
> > > implementation?
> > >
> > > The desired behavior is one task per line.
> > >
> > > Thanks.
> > >
> > >
> > >
> > > --
> > > http://www.neko-consulting.com
> > > Ego sum quis ego servo
> > > "Je suis ce que je protège"
> > > "I am what I protect"
> > >
> >
> >
> >
> > --
> > Best Regards
> >
> > Jeff Zhang
> >
>
>
>
>
>
