Hi, I just tested the groupByKey method on 100 GB of data. The cluster has 20 machines, each with 125 GB of RAM.
First I ran the experiment with conf.set("spark.shuffle.use.netty", "false"), and then I re-ran it with conf.set("spark.shuffle.use.netty", "true"). In the latter case, the GC time was much higher. I expected the Netty run to perform better, but it did not. So when should we use Netty for network shuffle fetching?