Hi list,

I have jobs that generate a huge amount of intermediate data. For example,
one of my jobs generates almost 12 GB of map output. I have 8 datanodes/TTs
and 1 master.

My reduce progress shows a copy speed in the range of 0.55 to 1 MB/s, but
normal file transfers between my datanodes generally reach 40-50 MB/s.
Why is my shuffle speed so slow?

Also, how is that number calculated, and what exactly does it signify? (Is
it the average speed across all mappers to that particular reducer, or
something else?) Any suggestions?

Thanks
