Using netty shuffle for network causes high GC time

lihu Tue, 13 Jan 2015 21:29:39 -0800

Hi,
     I just tested the groupByKey method on 100 GB of data. The cluster has
20 machines, each with 125 GB of RAM.


    First I set conf.set("spark.shuffle.use.netty", "false") and ran the
experiment. Then I set conf.set("spark.shuffle.use.netty", "true") and
re-ran it, but in the second case the GC time was much higher.
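
The two runs differed only in that one setting. Here is a minimal sketch of the
configuration, assuming Spark 1.x with spark-core on the classpath. The
object name, the helper buildConf, and the application name are placeholders
for illustration, not the actual job:

```scala
import org.apache.spark.SparkConf

object ShuffleNettyConf {
  // Build a SparkConf that differs only in the netty shuffle flag.
  // "spark.shuffle.use.netty" is the key used in the experiment above;
  // the application name is a hypothetical placeholder.
  def buildConf(useNetty: Boolean): SparkConf =
    new SparkConf()
      .setAppName("groupByKey-shuffle-test")
      .set("spark.shuffle.use.netty", useNetty.toString)

  val firstRun: SparkConf  = buildConf(useNetty = false) // first experiment
  val secondRun: SparkConf = buildConf(useNetty = true)  // second experiment, higher GC time
}
```

Everything else (input data, cluster, and the groupByKey job itself) was the
same between the two runs.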


    I expected the netty run to perform better, but it did not. So when
should we use netty for network shuffle fetching?
