Hi,
  We recently started submitting our Spark apps in yarn-client mode,
  and found that one stage has 2810 tasks.

  In Spark standalone mode, the same app needs far fewer tasks.

  From the debug info, we can see that most of these 2810 tasks process
  no data.

  How can I tune this?

SUBMIT COMMAND:

  spark-submit --num-executors 200 --conf spark.default.parallelism=32 \
    --conf spark.sql.shuffle.partitions=8 \
    --jars mysql-connector-java.jar,log4j-api-2.3.jar,log4j-core-2.3.jar \
    --master yarn-client

INFO:

[Stage 0:==>  (1 + 1) / 2][Stage 1:>    (0 + 2) / 2][Stage 3:(125 + 9) / 2433]


Note that stage 3 has 2433 tasks here, but in standalone mode the same
stage has far fewer.
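
For anyone reading the progress line above: as far as I understand the
console output, each entry reads (completed + running) / total tasks for
that stage. Below is a minimal Python sketch, assuming that layout, for
pulling the counts out of a pasted log line so the two modes can be
compared. The regex and the helper name parse_progress are hypothetical
and not part of Spark.

```python
import re

# Matches one console progress entry, e.g. "[Stage 3:(125 + 9) / 2433]".
# Assumed layout (not taken from Spark's source): (completed + running) / total.
STAGE_RE = re.compile(r"\[Stage (\d+):[=>\s]*\((\d+) \+ (\d+)\) / (\d+)\]")


def parse_progress(line):
    """Return {stage_id: (completed, running, total)} for each entry found."""
    return {
        int(stage): (int(done), int(running), int(total))
        for stage, done, running, total in STAGE_RE.findall(line)
    }


line = "[Stage 0:==>  (1 + 1) / 2][Stage 1:>    (0 + 2) / 2][Stage 3:(125 + 9) / 2433]"
progress = parse_progress(line)
for stage, (done, running, total) in sorted(progress.items()):
    print(f"stage {stage}: {done} done, {running} running, {total} total")
```

Running the same helper on the standalone-mode log would give the total
task count per stage side by side with the YARN numbers.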

Thanks
