Re: Spark Random Forest training costs the same time on YARN as on standalone

2016-10-20 Thread Xi Shen
If you are running locally, I do not see the point of starting 32 executors with 2 cores each. You can also check the Spark web console to find out where the time is spent. You may also want to read http://blog.cloudera.com/blog/2015/03/how-to-tune-your-apache-spark-jobs-part-2/
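
As an illustration of the local-mode point, a minimal sketch (e.g. pasted into spark-shell; the app name is just a placeholder, not your job):

    import org.apache.spark.sql.SparkSession

    // local[*] uses all cores of the local machine, so --num-executors and
    // --executor-cores are irrelevant for a purely local run.
    val spark = SparkSession.builder()
      .appName("rf-local-test")   // placeholder app name
      .master("local[*]")
      .getOrCreate()

    // While the job runs, the Spark web UI (http://localhost:4040 by default)
    // shows per-stage and per-task timings, i.e. where the time is spent.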

Spark Random Forest training costs the same time on YARN as on standalone

2016-10-20 Thread 陈哲
I'm training a random forest model using Spark 2.0 on YARN with a command like:

$SPARK_HOME/bin/spark-submit \
  --class com.netease.risk.prediction.HelpMain \
  --master yarn --deploy-mode client \
  --driver-cores 1 --num-executors 32 --executor-cores 2 \
  --driver-memory 10g --executor-memory 6g \
  --conf
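
For context, a minimal sketch of what a random forest training job submitted like this might look like with the spark.ml API. Whether HelpMain actually uses spark.ml or the older mllib API is an assumption, and the input path, column names, and tree count below are placeholders, not the real code:

    package com.netease.risk.prediction

    import org.apache.spark.ml.classification.RandomForestClassifier
    import org.apache.spark.sql.SparkSession

    // Class name taken from the spark-submit command above; the body is a sketch.
    object HelpMain {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("rf-training").getOrCreate()

        // Placeholder input: a libsvm-formatted dataset with "label"/"features".
        val data = spark.read.format("libsvm").load("hdfs:///path/to/training_data")

        val rf = new RandomForestClassifier()
          .setLabelCol("label")
          .setFeaturesCol("features")
          .setNumTrees(100)   // placeholder hyperparameter

        val model = rf.fit(data)
        println(s"Trained ${model.trees.length} trees")
        spark.stop()
      }
    }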