> Anyway.. wanna to ask if there is any Tez configuration or future release (running Tez 0.6) which might improve the disk utilisation during such heavyweight sorts !?
Make sure compression is turned on. Every time I've seen this issue, it had to do with someone turning off compression due to a bad libsnappy install.

tez.runtime.compress/tez.runtime.compress.codec

If you happen to use DefaultCodec, remember to set zlib.compress.level=BEST_SPEED (the value is the level name, not an int) as well in the job conf.

Further up in 0.8.x land, we don't do full merges if the pipelined shuffle is turned on, which for a 6 disk system allows a single skewed task to be about 12x bigger before hitting this exception.

Cheers,
Gopal
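
For reference, the settings named above could be written out roughly as the fragment below. This is only a sketch. It assumes the properties go into tez-site.xml or the job conf. The SnappyCodec value is an assumption that only makes sense with a working native Snappy install. The pipelined shuffle property name is an assumption as well, and applies to 0.8.x rather than 0.6.

```xml
<!-- Sketch only: values are illustrative, not tuned recommendations. -->
<property>
  <name>tez.runtime.compress</name>
  <value>true</value>
</property>
<property>
  <name>tez.runtime.compress.codec</name>
  <!-- Assumes native Snappy is installed correctly; otherwise use
       org.apache.hadoop.io.compress.DefaultCodec. -->
  <value>org.apache.hadoop.io.compress.SnappyCodec</value>
</property>
<property>
  <!-- Only relevant when the codec is DefaultCodec (zlib).
       The value is the level name, not an integer. -->
  <name>zlib.compress.level</name>
  <value>BEST_SPEED</value>
</property>
<property>
  <!-- Assumed property name for the pipelined shuffle; 0.8.x only. -->
  <name>tez.runtime.pipelined-shuffle.enabled</name>
  <value>true</value>
</property>
```

If the codec stays on DefaultCodec, the zlib.compress.level entry is the one that matters for sort speed. With Snappy it has no effect.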
