On 11/6/07, Doug Cutting <[EMAIL PROTECTED]> wrote:
>
> Joydeep Sen Sarma wrote:
> > One of the controversies is whether in the presence of failures, this
> > makes performance worse rather than better (kind of like udp vs. tcp -
> > what's better depends on error rate). The probability of a failure per
> > job will increase non-linearly as the number of nodes involved per job
> > increases. So what might make sense for small clusters may not make
> > sense for bigger ones. But it sure would be nice to have this option.
>
> Hmm. Personally I wouldn't put a very high priority on complicated
> features that don't scale well.
I don't think what we're asking for is necessarily complicated, and I think it can be made to scale quite well. The critique is fair for the current "polyreduce" prototype, but that is one particular solution to a general problem, and *scaling* is exactly the issue we are trying to address here (the current model is quite wasteful of computational resources).

As for complexity, I agree that the prototype Joydeep linked to is needlessly cumbersome. But in principle, the only additional information a scheduler needs in order to plan a composite job is a dependency graph between the mappers and reducers. Dependency graphs are about as clear as it gets in this business, and they aren't going to cause any scaling or conceptual problems.

I also don't see how failures would be more catastrophic if this were implemented purely in terms of smarter scheduling (which is emphatically not what the current polyreduce prototype does). You're running the exact same jobs you would be running otherwise, just sooner. :)

Chris
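
P.S. Roughly what I have in mind by "just a dependency graph" -- this is only a sketch, nothing from the current prototype, and all the class and method names below are made up for illustration:

    import java.util.*;

    // Rough sketch: a composite job expressed as a plain dependency graph
    // of map/reduce stages.  A smarter scheduler could topologically order
    // these and launch each stage as an ordinary job as soon as the stages
    // it depends on have finished.
    public class CompositeJobSketch {

        // A stage is just a named map or reduce step plus what it depends on.
        static class Stage {
            final String name;
            final List<Stage> upstream = new ArrayList<Stage>();
            Stage(String name) { this.name = name; }
            Stage dependsOn(Stage... stages) {
                upstream.addAll(Arrays.asList(stages));
                return this;
            }
            public String toString() { return name; }
        }

        // Topological order: a stage appears only after everything it depends on.
        static List<Stage> scheduleOrder(List<Stage> stages) {
            Map<Stage, Integer> remaining = new HashMap<Stage, Integer>();
            Map<Stage, List<Stage>> downstream = new HashMap<Stage, List<Stage>>();
            for (Stage s : stages) {
                remaining.put(s, s.upstream.size());
                downstream.put(s, new ArrayList<Stage>());
            }
            for (Stage s : stages) {
                for (Stage up : s.upstream) {
                    downstream.get(up).add(s);
                }
            }
            LinkedList<Stage> ready = new LinkedList<Stage>();
            for (Stage s : stages) {
                if (remaining.get(s) == 0) ready.add(s);
            }
            List<Stage> order = new ArrayList<Stage>();
            while (!ready.isEmpty()) {
                Stage s = ready.removeFirst();
                order.add(s);
                for (Stage d : downstream.get(s)) {
                    int left = remaining.get(d) - 1;
                    remaining.put(d, left);
                    if (left == 0) ready.add(d);
                }
            }
            if (order.size() != stages.size()) {
                throw new IllegalStateException("cycle in the dependency graph");
            }
            return order;
        }

        public static void main(String[] args) {
            // Two independent map stages feeding one reduce, feeding another reduce.
            Stage mapA = new Stage("map-A");
            Stage mapB = new Stage("map-B");
            Stage reduce1 = new Stage("reduce-1").dependsOn(mapA, mapB);
            Stage reduce2 = new Stage("reduce-2").dependsOn(reduce1);
            // Prints [map-A, map-B, reduce-1, reduce-2]
            System.out.println(scheduleOrder(Arrays.asList(mapA, mapB, reduce1, reduce2)));
        }
    }

The point is that nothing new gets run here -- the scheduler just knows which ordinary jobs it can start, and when.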