[jira] [Commented] (SPARK-9092) Make --num-executors compatible with dynamic allocation
[ https://issues.apache.org/jira/browse/SPARK-9092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14644809#comment-14644809 ]

Andrew Or commented on SPARK-9092:
--

To add to [~sandyryza]'s comment, I think there's actually an argument both ways:
- Dynamic allocation is not fundamental to Spark; it is really an opt-in feature. If the user explicitly set `spark.dynamicAllocation.enabled`, it is likely that they actually intended to use it. If they accidentally set `--num-executors` and we don't even log a warning, they may mistakenly believe they're sharing the cluster efficiently when they are not.
- On the flip side, there is otherwise no way to enable dynamic allocation by default in a cluster-wide setting. In fact, this precludes us from ever making dynamic allocation the default in Spark itself, even in the distant future (though I don't personally see that happening any time soon).

Because of the latter, I do agree we should have `--num-executors` override dynamic allocation, but not because that is fundamentally more correct. No strong opinion one way or the other.

> Make --num-executors compatible with dynamic allocation
> ---
>
> Key: SPARK-9092
> URL: https://issues.apache.org/jira/browse/SPARK-9092
> Project: Spark
> Issue Type: Improvement
> Components: YARN
> Affects Versions: 1.2.0
> Reporter: Niranjan Padmanabhan
>
> Currently when you enable dynamic allocation, you can't use --num-executors
> or the property spark.executor.instances. If we are to enable dynamic
> allocation by default, we should make these work so that existing workloads
> don't fail

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[ https://issues.apache.org/jira/browse/SPARK-9092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14641317#comment-14641317 ]

Niranjan Padmanabhan commented on SPARK-9092:
-

[~srowen], I'd be happy to do so. The PR is here: https://github.com/apache/spark/pull/7657

Is there a way to configure settings somewhere so that JIRA automatically recognizes the GitHub PR for this task?
[ https://issues.apache.org/jira/browse/SPARK-9092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14641314#comment-14641314 ]

Apache Spark commented on SPARK-9092:
-

User 'neurons' has created a pull request for this issue: https://github.com/apache/spark/pull/7657
[ https://issues.apache.org/jira/browse/SPARK-9092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14638689#comment-14638689 ]

Sean Owen commented on SPARK-9092:
--

I tend to agree with [~sandyr] for these reasons. The issue here is exactly that the user wants to override a sensible cluster-wide default. [~niranjanvp], would you like to try a PR?
[ https://issues.apache.org/jira/browse/SPARK-9092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14635942#comment-14635942 ]

Sandy Ryza commented on SPARK-9092:
---

I had a brief discussion with [~andrewor14] about this offline and wanted to move the discussion public.

--num-executors and dynamic allocation are fundamentally at odds with each other, in the sense that neither makes sense in the context of the other, so essentially one needs to override the other. My position is that it makes more sense for --num-executors to override dynamic allocation than the other way around, i.e. if --num-executors is set, behave as if dynamic allocation were disabled. The advantages of this are:
* Cluster operators can turn on dynamic allocation as the default cluster setting without impacting existing applications. The precedent set by existing big data processing frameworks (MR, Impala, Tez) is that users can depend on the framework to determine how many resources to acquire from the cluster manager, so I think it's reasonable that most clusters would want to move to dynamic allocation as the default. In a Cloudera setting, we plan to eventually enable dynamic allocation as the factory default for all clusters, but we'd like to minimize the extent to which we change the behavior of existing apps.
* --num-executors is conceptually a more specific property, and specific properties tend to override more general ones.
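The precedence rule proposed above can be sketched in miniature. This is an illustrative Python sketch, not Spark's actual Scala implementation; the function name and the fallback of 2 executors (Spark's historical YARN default) are assumptions made for the example:

```python
def resolve_allocation(conf):
    """Return ('fixed', n) or ('dynamic', None) for a dict of Spark conf keys.

    Hypothetical sketch of the proposal: an explicit
    spark.executor.instances / --num-executors setting wins over a
    cluster-wide spark.dynamicAllocation.enabled default.
    """
    num = conf.get("spark.executor.instances")
    dynamic = conf.get("spark.dynamicAllocation.enabled", "false") == "true"
    if num is not None:
        # The more specific user setting overrides the general cluster default.
        return ("fixed", int(num))
    if dynamic:
        return ("dynamic", None)
    return ("fixed", 2)  # assumed static default when neither is set
```

Under this rule, a cluster operator can ship `spark.dynamicAllocation.enabled=true` in the cluster defaults while an app that passes `--num-executors 4` behaves exactly as it did before dynamic allocation existed.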
[ https://issues.apache.org/jira/browse/SPARK-9092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14635930#comment-14635930 ]

Niranjan Padmanabhan commented on SPARK-9092:
-

More details: in a sample run of {{spark-shell --master yarn-client --conf spark.dynamicAllocation.enabled=true --conf spark.executor.instances=2}}, a java.lang.IllegalArgumentException is thrown from lines 201-203 of ClientArguments.scala, because spark.dynamicAllocation.enabled=true and spark.executor.instances=2 are both set.

> Make --num-executors compatible with dynamic allocation
> ---
>
> Key: SPARK-9092
> URL: https://issues.apache.org/jira/browse/SPARK-9092
> Project: Spark
> Issue Type: Improvement
> Components: YARN
> Reporter: Niranjan Padmanabhan
> Priority: Minor
>
> Currently when you enable dynamic allocation, you can't use --num-executors
> or the property spark.executor.instances. If we are to enable dynamic
> allocation by default, we should make these work so that existing workloads
> don't fail
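The rejected combination can be rendered in miniature. The following is a hypothetical Python sketch of the check described above, not the real code (which is Scala in ClientArguments.scala and throws java.lang.IllegalArgumentException); the function name is invented for illustration:

```python
def validate_executor_conf(conf):
    """Reject the conflicting combination described in this comment.

    `conf` is a dict of Spark configuration keys to string values.
    Mirrors in spirit, not in code, the check in ClientArguments.scala.
    """
    dynamic = conf.get("spark.dynamicAllocation.enabled", "false") == "true"
    if dynamic and "spark.executor.instances" in conf:
        raise ValueError(
            "spark.executor.instances and spark.dynamicAllocation.enabled "
            "must not both be set")
```

With this check in place, the spark-shell invocation quoted above fails at argument parsing time; either setting on its own passes validation.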