Is there a way to make a Reduce task exit early, before it has finished reading
all of its data? Basically I'm doing a group-by with a sum, and I only want to
return the top 1000 records, say. So I have a class-level int variable to keep
track of how many records have been written to the output so far, and as soon
as that limit is reached, I simply return at the top of the reduce() function.
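
For reference, here is roughly what I mean, as a plain-Java sketch with no
Hadoop dependency. The class name, the reduce() signature and the small driver
loop in main() are made up for illustration; they only stand in for the real
Reducer and for the framework calling it once per key. Note that it keeps the
first N groups in the order they arrive, it does not rank them by sum.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the reducer; not the real Hadoop Reducer class.
public class EarlyExitReduceSketch {
    private final int limit;      // e.g. 1000 in the real job
    private int written = 0;      // class-level counter of records written so far
    private final List<String> output = new ArrayList<>();

    public EarlyExitReduceSketch(int limit) {
        this.limit = limit;
    }

    // Called once per key with all values grouped under that key.
    public void reduce(String key, Iterable<Integer> values) {
        if (written >= limit) {
            // Limit reached: skip the work, but note that the caller
            // still invokes reduce() for every remaining key.
            return;
        }
        long sum = 0;
        for (int v : values) {
            sum += v;
        }
        output.add(key + "\t" + sum);
        written++;
    }

    public List<String> getOutput() {
        return output;
    }

    public static void main(String[] args) {
        // Assumed sample input, already grouped by key.
        Map<String, List<Integer>> grouped = new LinkedHashMap<>();
        grouped.put("a", Arrays.asList(1, 2));
        grouped.put("b", Arrays.asList(3));
        grouped.put("c", Arrays.asList(4, 5));

        EarlyExitReduceSketch reducer = new EarlyExitReduceSketch(2);
        int calls = 0;
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            reducer.reduce(e.getKey(), e.getValue());
            calls++; // every key is still delivered, even after the limit
        }
        System.out.println("records written: " + reducer.getOutput().size());
        System.out.println("reduce() calls:  " + calls);
    }
}
```

The early return saves the summing work, but every remaining key is still
handed to reduce(), which is the part I would like to avoid.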

Is there any way to optimize it even more to tell the Reduce task, "stop 
reading data, I don't need any more data"?

--Aaron
