bump

From: Jonathan Kelly <jonat...@amazon.com>
Date: Tuesday, July 14, 2015 at 4:23 PM
To: "user@spark.apache.org" <user@spark.apache.org>
Subject: Unable to use dynamicAllocation if spark.executor.instances is set in spark-defaults.conf
I've set up my cluster with a pre-calculated value for spark.executor.instances in spark-defaults.conf so that, by default, a job maximizes utilization of the cluster resources. However, if I want to run a job with dynamicAllocation (by passing -c spark.dynamicAllocation.enabled=true to spark-submit), I get this exception:

    Exception in thread "main" java.lang.IllegalArgumentException: Explicitly setting the number of executors is not compatible with spark.dynamicAllocation.enabled!
        at org.apache.spark.deploy.yarn.ClientArguments.parseArgs(ClientArguments.scala:192)
        at org.apache.spark.deploy.yarn.ClientArguments.<init>(ClientArguments.scala:59)
        at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:54)
        ...

The exception makes sense, of course, but ideally I would like Spark to ignore the spark.executor.instances value from spark-defaults.conf when I've enabled dynamicAllocation. The most annoying part is that if spark.executor.instances is present in spark-defaults.conf, I cannot find any way to spark-submit a job with spark.dynamicAllocation.enabled=true without hitting this error. That is, even if I pass "-c spark.executor.instances=0 -c spark.dynamicAllocation.enabled=true", I still get the error, because the validation in ClientArguments.parseArgs() that checks for this condition tests only for the presence of spark.executor.instances, not whether its value is > 0.

Should the check be changed to allow spark.executor.instances to be set to 0 when spark.dynamicAllocation.enabled is true? That would be an acceptable compromise, but I'd really prefer to be able to enable dynamicAllocation simply by setting spark.dynamicAllocation.enabled=true, rather than also having to set spark.executor.instances to 0.

Thanks,
Jonathan
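[Editor's note] The relaxed check suggested above could be sketched roughly as follows. This is a hypothetical Java sketch of the proposed logic, not the actual Scala source of ClientArguments.parseArgs; the class name, method name, and signature are invented for illustration. The point is that the validation would compare the configured value against zero instead of rejecting the setting merely because it is present:

```java
// Hypothetical sketch of the relaxed validation: reject an explicit
// executor count only when it is both set AND greater than zero while
// dynamic allocation is enabled. A value of 0 (or no value at all)
// would be allowed to coexist with spark.dynamicAllocation.enabled=true.
public final class ExecutorConfCheck {

    // numExecutors is null when spark.executor.instances is not set at all.
    public static void validate(Integer numExecutors, boolean dynamicAllocationEnabled) {
        if (dynamicAllocationEnabled && numExecutors != null && numExecutors > 0) {
            throw new IllegalArgumentException(
                "Explicitly setting the number of executors is not compatible with "
                    + "spark.dynamicAllocation.enabled!");
        }
    }

    public static void main(String[] args) {
        validate(null, true); // not set: fine with dynamic allocation
        validate(0, true);    // explicitly 0: fine under the relaxed check
        // validate(10, true) would still throw IllegalArgumentException
        System.out.println("relaxed check passed");
    }
}
```

Under this version of the check, setting "-c spark.executor.instances=0 -c spark.dynamicAllocation.enabled=true" on the spark-submit command line would override the value from spark-defaults.conf and get past the validation.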