The task has been running for several hours, and the map phase is essentially a null mapper: it just rewrites the key and value stored by an earlier reducer. There is no firewall. The entire job is running on an internal cluster, admittedly launched from my local box on the company network. It is running WAY slower than jobs previously run on the same hardware, and I suspect something is wrong, but I lack the tools to even start diagnosing the issue.
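
One possible starting point for the node-to-node network check suggested in the quoted message below is a raw TCP throughput probe. What follows is only a minimal sketch in Python, not a Hadoop tool: the names `run_sink`, `measure_mb_per_s`, and `loopback_demo` are hypothetical, the 16 MB transfer size is an arbitrary assumption, and the rate is timed on the sending side, so it is approximate. The demo runs over loopback so that it is self-contained. On a real cluster the sink would run on one slave node and the sender on another.

```python
import socket
import threading
import time

CHUNK = 64 * 1024  # bytes per send/recv call (arbitrary choice)


def run_sink(server_sock):
    """Accept one connection and discard everything it sends."""
    conn, _ = server_sock.accept()
    with conn:
        while conn.recv(CHUNK):
            pass


def measure_mb_per_s(host, port, total_bytes=16 * 1024 * 1024):
    """Send total_bytes to host:port and return the sender-side rate in MB/s.

    The timer stops when the last send returns, so data still sitting in
    kernel buffers makes this a slight overestimate.
    """
    payload = b"\0" * CHUNK
    sent = 0
    start = time.perf_counter()
    with socket.create_connection((host, port)) as sock:
        while sent < total_bytes:
            sock.sendall(payload)
            sent += len(payload)
    elapsed = time.perf_counter() - start
    return sent / (1024 * 1024) / elapsed


def loopback_demo():
    """Run sink and sender in one process over 127.0.0.1."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    sink = threading.Thread(target=run_sink, args=(server,), daemon=True)
    sink.start()
    rate = measure_mb_per_s("127.0.0.1", port)
    sink.join(timeout=10)
    server.close()
    return rate


if __name__ == "__main__":
    print(f"{loopback_demo():.2f} MB/s")
```

If a probe like this between two slave nodes reports a rate far above the 0.07 MB/s shown by the reduce task, the raw network is probably not the bottleneck, and the TaskTracker logs become the next place to look.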

On Fri, Nov 4, 2011 at 9:07 AM, Harsh J <ha...@cloudera.com> wrote:

> Steve,
>
> The copy phase may start early, and the slow copy could also just be due
> to unavailability of completed map outputs at this stage. Does your
> question eliminate that case here?
>
> I'd also check the network speeds you get between two slave nodes, and
> whether your TaskTracker logs indicate issues transferring map output
> requests via HTTP.
>
> Also, do you run any form of network filtering, firewalls, etc. that
> may be working at the packet level? I've seen it cause slowdowns before,
> but am not too sure if that's the case here.
>
> On 04-Nov-2011, at 8:50 PM, Steve Lewis wrote:
>
> > I have been finding that my cluster is running abnormally slowly.
> > A typical reduce task reports
> > reduce > copy (113 of 431 at 0.07 MB/s)
> > 70 KB/second is a truly dreadful rate, and tasks are running much
> > slower under Hadoop than the same code on the same operations on a
> > single box.
> > Where do I look to find why IO operations might be so slow?
> >
> > --
> > Steven M. Lewis PhD

--
Steven M. Lewis PhD
4221 105th Ave NE
Kirkland, WA 98033
206-384-1340 (cell)
Skype lordjoe_com