Hi all,

I just ran into an issue that likely resulted from my not-so-clever configuration, but I'd nonetheless like to share it with the community. This is all on Hadoop 2.7.3.
In my setup, each reducer fetched roughly 65K from each mapper's spill file. I disabled transferTo during the shuffle (mapreduce.shuffle.transferTo.allowed=false), because I wanted to have a look at the file system statistics, which miss the mmap calls that transferTo sometimes falls back to. I left the shuffle buffer size at its 128K default (not knowing about the parameter, mapreduce.shuffle.transfer.buffer.size, at the time). The effect was that I observed roughly 100% more data being read during the shuffle, since a full 128K was read for every 65K actually needed.

I added a quick fix to Hadoop that sizes the buffer as the minimum of the partition size and the configured shuffle buffer size:

https://github.com/apache/hadoop/compare/branch-2.7.3...robert-schmidtke:adaptive-shuffle-buffer

Benchmarking this version against mapreduce.shuffle.transferTo.allowed=true yields the same runtime and roughly 10% more reads in YARN during the shuffle phase (down from the previous 100%).

Maybe this is something that should be added to Hadoop? Or do users simply have to be more clever about their job configurations? I'd be happy to open a PR if this is deemed useful. A rough sketch of the idea is below.
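To make the change easier to picture, here is a minimal, self-contained sketch of the buffer-sizing logic. The names are illustrative (the real code lives in the ShuffleHandler's FadvisedFileRegion, if I'm not mistaken), and the 128K default is only assumed here; this is not a verbatim excerpt from my branch:

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;

/**
 * Sketch: when transferTo is disabled, copy a map output partition
 * through a heap buffer that is never larger than the partition itself,
 * so small partitions no longer trigger full buffer-sized reads.
 */
public class AdaptiveShuffleCopy {

    // Assumed stock default of mapreduce.shuffle.transfer.buffer.size.
    static final int DEFAULT_SHUFFLE_BUFFER_SIZE = 128 * 1024;

    /**
     * Copies 'count' bytes starting at 'position' from 'source' to
     * 'target'. The buffer is capped at the partition size, so a 65K
     * partition causes a 65K read instead of a 128K one.
     */
    static long copyPartition(FileChannel source, WritableByteChannel target,
                              long position, long count) throws IOException {
        int bufferSize = (int) Math.min(count, DEFAULT_SHUFFLE_BUFFER_SIZE);
        ByteBuffer buffer = ByteBuffer.allocate(bufferSize);

        long transferred = 0;
        while (transferred < count) {
            buffer.clear();
            // Never read past the end of the partition.
            buffer.limit((int) Math.min(buffer.capacity(), count - transferred));
            int read = source.read(buffer, position + transferred);
            if (read < 0) {
                break; // unexpected end of file
            }
            buffer.flip();
            while (buffer.hasRemaining()) {
                target.write(buffer);
            }
            transferred += read;
        }
        return transferred;
    }
}

Anyway, thanks for the attention!

Cheers,
Robert

--
My GPG Key ID: 336E2680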