I meant splitting a very large file so that it can be distributed over multiple map tasks.
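
To make the idea concrete, here is a minimal sketch of how a line-oriented input format can cut one big uncompressed file into byte ranges and still hand each line to exactly one task. This is a simplified model, not Hadoop's actual code: the names `compute_splits` and `read_split` are made up for illustration, and the file is just an in-memory byte string. The convention assumed here is that every split except the first discards the line it lands in, and every split keeps reading until it has passed its own end offset.

```python
import io


def compute_splits(file_size, split_size):
    """Return (start, length) byte ranges that cover the whole file."""
    return [(start, min(split_size, file_size - start))
            for start in range(0, file_size, split_size)]


def read_split(data, start, length):
    """Return the lines owned by the split (start, length).

    Assumed convention (hypothetical, for illustration only):
    - a split that does not begin at offset 0 throws away the first
      line it reads, because the previous split is responsible for it;
    - a split keeps reading whole lines as long as the line begins at
      or before the split's end offset.
    """
    end = start + length
    stream = io.BytesIO(data)
    stream.seek(start)
    if start != 0:
        stream.readline()  # the previous split reads this line
    lines = []
    while stream.tell() <= end:
        line = stream.readline()
        if not line:  # end of file
            break
        lines.append(line.rstrip(b"\n"))
    return lines


if __name__ == "__main__":
    data = b"alpha\nbravo\ncharlie\ndelta\n"
    for start, length in compute_splits(len(data), 10):
        print((start, length), read_split(data, start, length))
```

Each split can be read independently only because a reader can seek to an arbitrary byte offset and start looking for the next newline. A gzip stream cannot be decoded from the middle, which is why the whole file ends up in a single task. If the goal really is a fixed number of lines per task, Hadoop's NLineInputFormat may be worth a look.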

Alex.
http://sematext.com

On Tue, May 11, 2010 at 6:13 AM, himanshu chandola <himanshu_cool...@yahoo.com> wrote:

> Actually, would you have a case where no splitting is needed? Just curious.
>
> It seems that you would either use LZO or not use any compression at all.
>
> H
>
> ----- Original Message ----
> From: Alex Baranov <alex.barano...@gmail.com>
> To: common-user@hadoop.apache.org
> Sent: Mon, May 10, 2010 4:27:11 PM
> Subject: Re: Fully distribute TextInputFormat...
>
> If I'm not mistaken, LZO compression is the better fit when splitting is
> needed, not gzip.
>
> Alex Baranau
>
> http://sematext.com
>
> On Mon, May 10, 2010 at 3:52 PM, Jeff Zhang <zjf...@gmail.com> wrote:
>
> > What's the format of this file? gzip cannot be split.
> >
> > On Mon, May 10, 2010 at 5:21 AM, Pierre ANCELOT <pierre...@gmail.com>
> > wrote:
> > > Hi folks :)
> > > I have one big file... I read it with FileInputFormat, but this
> > > generates only one task, so of course the work doesn't get distributed
> > > across the cluster nodes.
> > > Should I use another input class, or do I have a bug in my
> > > implementation?
> > >
> > > The desired behavior is one task per line.
> > >
> > > Thanks.
> > >
> > > --
> > > http://www.neko-consulting.com
> > > Ego sum quis ego servo
> > > "Je suis ce que je protège"
> > > "I am what I protect"
> >
> > --
> > Best Regards
> >
> > Jeff Zhang