Steve,

The copy phase may start early, before all maps have finished, so the slow 
copy could simply be due to completed map outputs not yet being available at 
this stage. Do your observations rule that case out?

I'd also check the network speeds you get between two slave nodes, and whether 
your TaskTracker logs indicate any issues serving map output requests over 
HTTP.
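
For the node-to-node speed check, a dedicated tool is usually the better 
choice, but a rough sketch like the one below can give a first number to 
compare against the 0.07 MB/s the reducer reports. It is only an illustration: 
the function names, the port 50007 and the 64 MiB transfer size are my own 
assumptions, not anything Hadoop provides, and the sender-side timing is 
approximate because it does not wait for the receiver to finish reading.

```python
import socket
import threading
import time
from typing import Optional

CHUNK = 64 * 1024  # bytes per send/recv call (arbitrary choice)


def serve_once(host: str = "0.0.0.0", port: int = 50007,
               ready: Optional[threading.Event] = None) -> int:
    """Accept one connection, read until the peer closes, return bytes received."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind((host, port))
        srv.listen(1)
        if ready is not None:
            ready.set()  # signal that the listener is up
        conn, _ = srv.accept()
        total = 0
        with conn:
            while True:
                data = conn.recv(CHUNK)
                if not data:  # empty read means the sender closed the socket
                    break
                total += len(data)
        return total


def measure_send(host: str, port: int = 50007, total_mb: int = 64) -> float:
    """Send total_mb MiB of zero bytes to host:port; return the rate in MiB/s."""
    payload = b"\0" * CHUNK
    to_send = total_mb * 1024 * 1024
    sent = 0
    start = time.perf_counter()
    with socket.create_connection((host, port)) as sock:
        while sent < to_send:
            sock.sendall(payload)
            sent += len(payload)
    elapsed = time.perf_counter() - start
    return sent / (1024 * 1024) / elapsed
```

Run serve_once() on one slave and measure_send("<other slave's hostname>") on 
another. If the rate is orders of magnitude above 0.07 MB/s, the raw network 
is probably not the bottleneck and the TaskTracker logs deserve a closer look.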

Also, do you run any form of network filtering (firewalls, etc.) that may be 
working at the packet level? I've seen that cause slowdowns before, but I'm 
not sure it's the case here.

On 04-Nov-2011, at 8:50 PM, Steve Lewis wrote:

> I have been finding that my cluster is running abnormally slowly.
> A typical reduce task reports 
> reduce > copy (113 of 431 at 0.07 MB/s) 
> 70 KB/s is a truly dreadful rate, and tasks are running much slower 
> under Hadoop than the 
> same code doing the same operations on a single box.
> Where do I look to find out why IO operations might be so slow?
> 
> -- 
> Steven M. Lewis PhD
>  
> 
