When running Spark jobs we see a delay, and running top -H -i -p <pid> during it shows a single thread labeled "map-output-disp" at 99.7% CPU for the majority of the delay period. The delay gets progressively worse as the partition count increases.
The delay seems to come from the class org.apache.spark.MapOutputTracker in the Spark core code. Is there any way to speed this up?
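
For context, here is a rough back-of-the-envelope sketch of why the partition count might matter. It assumes the driver tracks roughly one size entry per (map task, reduce partition) pair for each shuffle, and it ignores any compression Spark applies to those statuses, so treat the numbers as an upper-bound illustration rather than a measurement. The function name map_output_entries is made up for this example and is not a Spark API.

```python
def map_output_entries(num_map_tasks: int, num_reduce_partitions: int) -> int:
    """Rough count of per-block size entries tracked for one shuffle.

    Assumption: one entry per (map task, reduce partition) pair,
    with no compression of the map statuses.
    """
    return num_map_tasks * num_reduce_partitions


if __name__ == "__main__":
    # Assumption for illustration: both sides of the shuffle use the same partition count.
    for partitions in (200, 2_000, 20_000):
        entries = map_output_entries(partitions, partitions)
        print(f"{partitions:>6} partitions -> {entries:>12,} entries")
```

If that assumption holds, a 10x increase in partitions means about 100x more entries for the tracker to serialize and hand out, which would be consistent with the delay growing faster than the partition count.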