Re: Quick Question: LineSplit or BlockSplit

Mark Kerzner Mon, 07 Feb 2011 18:33:15 -0800

Thanks!
Mark

On Mon, Feb 7, 2011 at 8:28 PM, Ted Dunning <tdunn...@maprtech.com> wrote:


> That is quite doable.  One way to do it is to make the max split size quite
> small.
>
> On Mon, Feb 7, 2011 at 6:14 PM, Mark Kerzner <markkerz...@gmail.com>
> wrote:
>
> > Ted,
> >
> > I am also interested in this answer.
> >
> > I put the name of a zip file on a line in an input file, and I want one
> > mapper to read this line, and start working on it (since it now knows the
> > path in HDFS). Are you saying it's not doable?
> >
> > Thank you,
> > Mark
> >
> > On Mon, Feb 7, 2011 at 8:10 PM, Ted Dunning <tdunn...@maprtech.com> wrote:
> >
> > > Option (1) isn't the way that things normally work.  Besides, mappers
> > > are called many times for each construction of a mapper.
> > >
> > > On Mon, Feb 7, 2011 at 3:38 PM, maha <m...@umail.ucsb.edu> wrote:
> > >
> > > > Hi,
> > > >
> > > > >  I would appreciate it if you could give me your thoughts on whether
> > > > > there is an effect on efficiency if:
> > > >
> > > >  1) Mappers were per line in a document
> > > >
> > > >  or
> > > >
> > > >  2) Mappers were per block of lines in a document.
> > > >
> > > >
> > > > >  I know the obvious difference I can see is that (1) has more
> > > > > mappers. Does that mean (1) will be slower because of scheduling time?
> > > >
> > > > Thank you,
> > > > Maha
> > > >
> > >
> >
>
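
To make the "max split size quite small" suggestion concrete, here is a
rough sketch of how the split count responds to that setting. It is a
simplified model, not Hadoop code: it assumes the split size is chosen as
max(minSize, min(maxSize, blockSize)) and that the last split may run up to
10% over, which is how FileInputFormat in the new API is understood to
behave. The function names and the 10,000-byte list file are hypothetical.

```python
# Simplified model of FileInputFormat-style split sizing (an assumption-laden
# sketch, not the actual Hadoop implementation).

SPLIT_SLOP = 1.1  # assumed: the final split may be up to 10% larger


def compute_split_size(block_size, min_size, max_size):
    """Pick the split size: capped by max_size, floored by min_size."""
    return max(min_size, min(max_size, block_size))


def count_splits(file_length, split_size):
    """Count the byte-range splits (and so map tasks) for one file."""
    splits = 0
    remaining = file_length
    # Carve off full-size splits while more than ~1.1 splits' worth remains.
    while remaining / split_size > SPLIT_SLOP:
        splits += 1
        remaining -= split_size
    # Whatever is left (if anything) becomes the final split.
    if remaining > 0:
        splits += 1
    return splits


if __name__ == "__main__":
    block = 64 * 1024 * 1024   # a 64 MB HDFS block
    file_length = 10_000       # hypothetical input file listing zip paths

    default_size = compute_split_size(block, 1, 2**63 - 1)
    small_size = compute_split_size(block, 1, 100)

    print(count_splits(file_length, default_size))  # one split, one mapper
    print(count_splits(file_length, small_size))    # 100 splits
```

Note that splits are byte ranges, not lines, so a small max split size only
approximates "one line per mapper" when the lines are of similar length. In
a real job the cap would be set through the job configuration, for example
with FileInputFormat.setMaxInputSplitSize(job, bytes) in the new API.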
