[ https://issues.apache.org/jira/browse/SPARK-13723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15218701#comment-15218701 ]
Marcelo Vanzin commented on SPARK-13723:
----------------------------------------

I'm not a great fan of changing the behavior, but I understand the point.

To be clear: {{--num-executors}} is mostly a fancy alias for {{spark.executor.instances}}. Are you suggesting that you'd break that coupling, so that if dynamic allocation is on, it would map to something else? And if both {{spark.executor.instances}} and dynamic allocation are provided, something else would happen (potentially maintaining the current behavior)?

> YARN - Change behavior of --num-executors when spark.dynamicAllocation.enabled true
> -----------------------------------------------------------------------------------
>
>                 Key: SPARK-13723
>                 URL: https://issues.apache.org/jira/browse/SPARK-13723
>             Project: Spark
>          Issue Type: Improvement
>          Components: YARN
>    Affects Versions: 2.0.0
>            Reporter: Thomas Graves
>            Priority: Minor
>
> I think we should change the behavior when --num-executors is specified and dynamic allocation is enabled. Currently, if --num-executors is specified, dynamic allocation is disabled and it just uses a static number of executors. I would rather see the default behavior changed in the 2.x line. If the dynamic allocation config is on, then num-executors becomes both the max and the initial # of executors. I think this would allow users to easily cap their usage and would still allow it to free up executors. It would also allow users doing ML to start out with a # of executors, and if they are actually caching the data the executors wouldn't be freed up. So you would get very similar behavior to if dynamic allocation was off.
> Part of the reason for this is that using a static number generally wastes resources, especially with people doing ad hoc things with spark-shell. It also has a big effect when people are doing MapReduce/ETL type workloads.
> The problem is that people are used to specifying num-executors, so if we turn it on by default in a cluster config it's just overridden.
> We should also update the spark-submit --help description for --num-executors.
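
To make the two interpretations in this thread concrete, here is a minimal Python sketch. It is not Spark code: `resolve_executors` and its `proposed` switch are hypothetical names introduced only for illustration, and "dynamic allocation is disabled" is modelled by simply flipping the flag, which is a simplification. The configuration keys themselves are Spark's.

```python
def resolve_executors(conf, num_executors=None, proposed=False):
    """Return a copy of `conf` showing how --num-executors would be read.

    Hypothetical helper for illustration only; it is not part of Spark.
    """
    resolved = dict(conf)  # never mutate the caller's (cluster-wide) config
    dynamic = resolved.get("spark.dynamicAllocation.enabled", "false") == "true"

    if num_executors is None:
        # Flag not given: the cluster config is used unchanged.
        return resolved

    if dynamic and proposed:
        # Proposed in SPARK-13723: the flag becomes both the starting
        # point and the cap, and dynamic allocation stays enabled.
        resolved["spark.dynamicAllocation.initialExecutors"] = str(num_executors)
        resolved["spark.dynamicAllocation.maxExecutors"] = str(num_executors)
    else:
        # Behavior described in the issue: the flag is an alias for
        # spark.executor.instances, and a static count wins over
        # dynamic allocation (assumption: modelled by flipping the flag).
        resolved["spark.executor.instances"] = str(num_executors)
        if dynamic:
            resolved["spark.dynamicAllocation.enabled"] = "false"
    return resolved


cluster_defaults = {"spark.dynamicAllocation.enabled": "true"}
current = resolve_executors(cluster_defaults, num_executors=10)
changed = resolve_executors(cluster_defaults, num_executors=10, proposed=True)
```

With the cluster default above, `current` ends up with a static count of 10 and dynamic allocation off, while `changed` keeps dynamic allocation on and caps it at 10. The open question raised in the comment, what happens when {{spark.executor.instances}} is set directly alongside dynamic allocation, is deliberately left out of the sketch.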