Hi,
     I just tested the groupByKey method on 100GB of data. The cluster has
20 machines, each with 125GB of RAM.

    At first I set conf.set("spark.shuffle.use.netty", "false") and ran
the experiment. Then I set conf.set("spark.shuffle.use.netty", "true")
and re-ran it, but in the latter case the GC time was much higher.
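
For reference, the test is roughly of the following shape. This is only a sketch, not the exact job: the object name, the command-line arguments, the input path and the tab-separated keying are placeholders assumed for illustration. Only the spark.shuffle.use.netty setting and the groupByKey call are the parts under test.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._ // pair-RDD implicits, needed on older Spark versions

// Hypothetical driver: args(0) = input path, args(1) = "true" or "false" for the netty flag.
object ShuffleNettyTest {
  def main(args: Array[String]): Unit = {
    val inputPath = args(0)
    val useNetty = args(1)

    // The only difference between the two runs is this one setting.
    val conf = new SparkConf()
      .setAppName("groupByKey-netty-test")
      .set("spark.shuffle.use.netty", useNetty)
    val sc = new SparkContext(conf)

    // Placeholder keying (an assumption): use the first tab-separated field as the key.
    val pairs = sc.textFile(inputPath).map { line =>
      val key = line.split("\t", 2)(0)
      (key, line)
    }

    // count() is an action, so it forces the groupByKey shuffle to actually run.
    val numGroups = pairs.groupByKey().count()
    println(s"number of groups: $numGroups")

    sc.stop()
  }
}
```

Running it once with "false" and once with "true" on the same input, and comparing the GC time of the shuffle-fetching stage in the web UI, is the comparison described above.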


    I expected the netty run to be better, but it was not. So when should
we use netty for network shuffle fetching?
