Yeah, we could make it log a warning instead.

2015-07-15 14:29 GMT-07:00 Kelly, Jonathan <jonat...@amazon.com>:
> Thanks! Is there an existing JIRA I should watch?
>
> ~ Jonathan
>
> From: Sandy Ryza <sandy.r...@cloudera.com>
> Date: Wednesday, July 15, 2015 at 2:27 PM
> To: Jonathan Kelly <jonat...@amazon.com>
> Cc: "user@spark.apache.org" <user@spark.apache.org>
> Subject: Re: Unable to use dynamicAllocation if spark.executor.instances
> is set in spark-defaults.conf
>
> Hi Jonathan,
>
> This is a problem that has come up for us as well, because we'd like
> dynamic allocation to be turned on by default in some setups, but not break
> existing users with these properties. I'm hoping to figure out a way to
> reconcile these by Spark 1.5.
>
> -Sandy
>
> On Wed, Jul 15, 2015 at 3:18 PM, Kelly, Jonathan <jonat...@amazon.com>
> wrote:
>
>> Would there be any problem in having spark.executor.instances (or
>> --num-executors) be completely ignored (i.e., even for non-zero values) if
>> spark.dynamicAllocation.enabled is true, rather than throwing an
>> exception?
>>
>> I can see how the exception would be helpful if, say, you tried to pass
>> both "-c spark.executor.instances" (or --num-executors) *and* "-c
>> spark.dynamicAllocation.enabled=true" to spark-submit on the command line
>> (as opposed to having one of them in spark-defaults.conf and one of them
>> in the spark-submit args), but currently there doesn't seem to be any way
>> to distinguish between arguments that were actually passed to spark-submit
>> and settings that simply came from spark-defaults.conf.
>>
>> If there were a way to distinguish them, I think the ideal situation
>> would be for the validation exception to be thrown only if
>> spark.executor.instances and spark.dynamicAllocation.enabled=true were
>> both passed via spark-submit args or were both present in
>> spark-defaults.conf; passing spark.dynamicAllocation.enabled=true to
>> spark-submit would take precedence over spark.executor.instances
>> configured in spark-defaults.conf, and vice versa.
>>
>> Jonathan Kelly
>>
>> Elastic MapReduce - SDE
>>
>> Blackfoot (SEA33) 06.850.F0
>>
>> From: Jonathan Kelly <jonat...@amazon.com>
>> Date: Tuesday, July 14, 2015 at 4:23 PM
>> To: "user@spark.apache.org" <user@spark.apache.org>
>> Subject: Unable to use dynamicAllocation if spark.executor.instances is
>> set in spark-defaults.conf
>>
>> I've set up my cluster with a pre-calculated value for
>> spark.executor.instances in spark-defaults.conf so that by default a job
>> maximizes the utilization of the cluster resources. However, if I want to
>> run a job with dynamicAllocation (by passing -c
>> spark.dynamicAllocation.enabled=true to spark-submit), I get this
>> exception:
>>
>> Exception in thread "main" java.lang.IllegalArgumentException:
>> Explicitly setting the number of executors is not compatible with
>> spark.dynamicAllocation.enabled!
>>     at org.apache.spark.deploy.yarn.ClientArguments.parseArgs(ClientArguments.scala:192)
>>     at org.apache.spark.deploy.yarn.ClientArguments.<init>(ClientArguments.scala:59)
>>     at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:54)
>>     …
>>
>> The exception makes sense, of course, but ideally I would like it to
>> ignore what I've put in spark-defaults.conf for spark.executor.instances
>> when I've enabled dynamicAllocation. The most annoying part is that if
>> spark.executor.instances is present in spark-defaults.conf, I cannot find
>> any way to spark-submit a job with spark.dynamicAllocation.enabled=true
>> without getting this error. That is, even if I pass "-c
>> spark.executor.instances=0 -c spark.dynamicAllocation.enabled=true", I
>> still get the error, because the validation in
>> ClientArguments.parseArgs() that checks for this condition only checks
>> for the presence of spark.executor.instances, rather than whether its
>> value is > 0.
>>
>> Should the check be changed to allow spark.executor.instances to be set
>> to 0 when spark.dynamicAllocation.enabled is true? That would be an OK
>> compromise, but I'd really prefer to be able to enable dynamicAllocation
>> simply by setting spark.dynamicAllocation.enabled=true, rather than also
>> having to set spark.executor.instances to 0.
>>
>> Thanks,
>>
>> Jonathan
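For readers following along: the behavior being proposed (warn and ignore the executor count, rather than throw) could look roughly like the sketch below. This is a hypothetical illustration, not the actual Spark source; `ValidationSketch`, its method, and the returned warning string are made up for this example. It only flags a conflict when spark.executor.instances is explicitly set to a value > 0 while dynamic allocation is enabled, so "-c spark.executor.instances=0" no longer trips the check.

```scala
// Hypothetical sketch of a relaxed validation (not Spark's actual code).
// Instead of throwing IllegalArgumentException whenever
// spark.executor.instances is merely *present*, it ignores 0 and turns a
// positive value into a warning message, as suggested in the thread above.
object ValidationSketch {
  // Returns Some(warning) if the settings conflict, None otherwise.
  def validate(conf: Map[String, String]): Option[String] = {
    val dynamicAllocation =
      conf.get("spark.dynamicAllocation.enabled").contains("true")
    val numExecutors =
      conf.get("spark.executor.instances").map(_.toInt)

    (dynamicAllocation, numExecutors) match {
      case (true, Some(n)) if n > 0 =>
        // Previously: throw new IllegalArgumentException(
        //   "Explicitly setting the number of executors is not compatible ...")
        Some(s"Ignoring spark.executor.instances=$n because " +
          "spark.dynamicAllocation.enabled is true")
      case _ =>
        None // value absent, 0, or dynamic allocation off: no conflict
    }
  }
}
```

Under this shape, a non-zero spark.executor.instances left in spark-defaults.conf would merely produce a warning when dynamic allocation is requested, instead of failing the job.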