I have the same problem with reducers going past 100% on some jobs.
I've seen reducers go as high as 120%. Would love to know what the
issue is.
On Mar 9, 2009, at 8:45 AM, Doug Cook wrote:
Hi folks,
I've recently upgraded to Hadoop 0.19.1 from a much, much older version of
Hadoop. Most things in my application (a highly modified version of Nutch)
are working just fine, but one of them is bombing out with odd symptoms.
The map works just fine, but then the reduce phase (a) runs extremely
slowly and (b) the "percentage complete" reported for each reduce task
doesn't stop at 100%; it just keeps going past that.
I figure I'll start by understanding the percentage-complete reporting
issue, since it's pretty concrete and may have some bearing on the
performance issue. It seems likely that my application is misconfiguring
the job, or otherwise not using the Hadoop API correctly. I don't think
I'm doing anything way out of the ordinary: my reducer simply creates an
object, wraps it in an ObjectWritable, and calls output.collect(), and I
have a local class that implements OutputFormat to take the object and put
it in a Lucene index. It does actually create correct output, at least for
small indices; on large indices, the performance problems are killing me.
I can and will start rummaging around in the Hadoop code to figure out how
it calculates percentage complete and see what I'm not doing correctly,
but I thought I'd ask here, too, in case someone has good suggestions off
the top of their head.
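One mechanism worth checking (this is a guess, not something confirmed from
the thread): a progress counter is typically a ratio of bytes processed so
far to an estimated total, and if the estimate comes from compressed sizes
while the numerator counts decompressed bytes read, the ratio can sail past
1.0. A toy sketch with made-up numbers, plain Java, no Hadoop classes:

```java
// Sketch: a progress ratio exceeds 100% when the denominator is an
// underestimate, e.g. compressed sizes vs. decompressed bytes read.
// All numbers here are hypothetical, for illustration only.
public class ProgressSketch {
    // progress = bytes processed so far / estimated total bytes
    static double progress(long processedBytes, long estimatedTotalBytes) {
        return (double) processedBytes / estimatedTotalBytes;
    }

    public static void main(String[] args) {
        // Estimated total, say from compressed map-output sizes: 100 MB.
        long estimated = 100L * 1024 * 1024;
        // Bytes actually read after decompression: 120 MB.
        long processed = 120L * 1024 * 1024;
        System.out.printf("reduce progress: %.0f%%%n",
                progress(processed, estimated) * 100);
        // prints: reduce progress: 120%
    }
}
```

If something like this is happening, the over-100% figure would be cosmetic
and the slowness a separate problem, so it may be worth ruling out before
digging further.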
Many thanks-
Doug Cook
--
View this message in context:
http://www.nabble.com/Reducer-goes-past-100--complete--tp22413589p22413589.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.