Hi Patrick,

The fix you need is SPARK-6954: https://github.com/apache/spark/pull/5704.
If possible, you may cherry-pick the following commit into your Spark
deployment and it should resolve the issue:

https://github.com/apache/spark/commit/98ac39d2f5828fbdad8c9a4e563ad1169e3b9948

Note that this commit applies only to the 1.3 branch. If you can upgrade to
1.4.0, you do not need to apply it yourself.
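
To make the failure mode concrete, here is a toy Python model of it. This is
not Spark code and not how the SPARK-6954 patch works; every name in it is
hypothetical. It only sketches why a target of -1 trips the argument check
seen in the stack trace below, and how a caller could clamp its target into
a valid range before making the request.

```python
# Toy model of the failure described in this thread; NOT Spark code.
# All function and parameter names are hypothetical.


def request_total_executors(num_executors: int) -> int:
    """Mimic the argument check from the stack trace: reject negative totals."""
    if num_executors < 0:
        raise ValueError(
            "Attempted to request a negative number of executor(s) "
            f"{num_executors} from the cluster manager."
        )
    return num_executors


def safe_target(current_target: int, delta: int,
                min_executors: int, max_executors: int) -> int:
    """Clamp the new target into [min_executors, max_executors]."""
    return max(min_executors, min(max_executors, current_target + delta))
```

With these definitions, request_total_executors(-1) raises, while
request_total_executors(safe_target(0, -1, 0, 10)) asks for 0 executors.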

-Andrew



2015-06-13 12:01 GMT-07:00 Patrick Woody <patrick.woo...@gmail.com>:

> Hey Sandy,
>
> I'll test it out on 1.4. Do you have a bug number or PR that I could
> reference as well?
>
> Thanks!
> -Pat
>
> On Jun 13, 2015, at 11:38 AM, Sandy Ryza <sandy.r...@cloudera.com> wrote:
>
> Hi Patrick,
>
> I'm noticing that you're using Spark 1.3.1.  We fixed a bug in dynamic
> allocation in 1.4 that permitted requesting negative numbers of executors.
> Any chance you'd be able to try with the newer version and see if the
> problem persists?
>
> -Sandy
>
> On Fri, Jun 12, 2015 at 7:42 PM, Patrick Woody <patrick.woo...@gmail.com>
> wrote:
>
>> Hey all,
>>
>> I've recently run into an issue where Spark's dynamic allocation asked
>> for -1 executors from YARN. Unfortunately, this raises an exception that
>> kills the executor-allocation thread, so the application can't request
>> more resources.
>>
>> Has anyone seen this before? It happens only sporadically and the
>> application usually works, but when it is hit the application becomes
>> unusable because it is stuck at the minimum YARN resources.
>>
>> Stacktrace below.
>>
>> Thanks!
>> -Pat
>>
>> ERROR [2015-06-12 16:44:39,724] org.apache.spark.util.Utils: Uncaught exception in thread spark-dynamic-executor-allocation-0
>> ! java.lang.IllegalArgumentException: Attempted to request a negative number of executor(s) -1 from the cluster manager. Please specify a positive number!
>> ! at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.requestTotalExecutors(CoarseGrainedSchedulerBackend.scala:338) ~[spark-core_2.10-1.3.1.jar:1.
>> ! at org.apache.spark.SparkContext.requestTotalExecutors(SparkContext.scala:1137) ~[spark-core_2.10-1.3.1.jar:1.3.1]
>> ! at org.apache.spark.ExecutorAllocationManager.addExecutors(ExecutorAllocationManager.scala:294) ~[spark-core_2.10-1.3.1.jar:1.3.1]
>> ! at org.apache.spark.ExecutorAllocationManager.addOrCancelExecutorRequests(ExecutorAllocationManager.scala:263) ~[spark-core_2.10-1.3.1.jar:1.3.1]
>> ! at org.apache.spark.ExecutorAllocationManager.org$apache$spark$ExecutorAllocationManager$$schedule(ExecutorAllocationManager.scala:230) ~[spark-core_2.10-1.3.1.j
>> ! at org.apache.spark.ExecutorAllocationManager$$anon$1$$anonfun$run$1.apply$mcV$sp(ExecutorAllocationManager.scala:189) ~[spark-core_2.10-1.3.1.jar:1.3.1]
>> ! at org.apache.spark.ExecutorAllocationManager$$anon$1$$anonfun$run$1.apply(ExecutorAllocationManager.scala:189) ~[spark-core_2.10-1.3.1.jar:1.3.1]
>> ! at org.apache.spark.ExecutorAllocationManager$$anon$1$$anonfun$run$1.apply(ExecutorAllocationManager.scala:189) ~[spark-core_2.10-1.3.1.jar:1.3.1]
>> ! at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1618) ~[spark-core_2.10-1.3.1.jar:1.3.1]
>> ! at org.apache.spark.ExecutorAllocationManager$$anon$1.run(ExecutorAllocationManager.scala:189) [spark-core_2.10-1.3.1.jar:1.3.1]
>> ! at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_71]
>> ! at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304) [na:1.7.0_71]
>> ! at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) [na:1.7.0_71]
>> ! at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [na:1.7.0_71]
>> ! at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_71]
>> ! at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_71]
>>
>
>