Hi Axel,

You can try setting `spark.deploy.spreadOut` to false (through your
conf/spark-defaults.conf file). What this does is make the master schedule
as many cores on one worker as possible before spilling over to other
workers. Note that you *must* restart the cluster through the sbin scripts
for the setting to take effect.
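
For example, a rough sketch (this assumes a default Spark layout, with the
setting applied on the node running the standalone master):

  # conf/spark-defaults.conf
  spark.deploy.spreadOut  false

  # then restart the standalone cluster so the master picks it up
  sbin/stop-all.sh
  sbin/start-all.sh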

For more information see:
http://spark.apache.org/docs/latest/spark-standalone.html.
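
If you instead go the --total-executor-cores route Igor describes below,
the submission could look roughly like this (the master URL, memory, core
count, and jar name are all placeholders for your setup):

  spark-submit \
    --master spark://<your-master-host>:7077 \
    --total-executor-cores 8 \
    --executor-memory 10G \
    your_app.jar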

Feel free to let me know whether it works,
-Andrew


2015-08-18 4:49 GMT-07:00 Igor Berman <igor.ber...@gmail.com>:

> By default, standalone mode creates 1 executor on every worker machine per
> application. The overall number of cores is configured with
> --total-executor-cores, so in general if you specify
> --total-executor-cores=1 there will be only 1 core on some executor and
> you'll get what you want.
>
> On the other hand, if your application needs all the cores of your cluster
> and only some specific job should run on a single executor, there are a few
> ways to achieve this, e.g. coalesce(1) or
> dummyRddWithOnePartitionOnly.foreachPartition.
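>
> Roughly something like this (just a sketch in spark-shell; the numbers are
> only a stand-in for your data):
>
>   val rdd = sc.parallelize(1 to 100)          // sc = your SparkContext
>   rdd.coalesce(1).foreachPartition { iter =>
>     // everything in here runs as a single task, i.e. on one executor
>     iter.foreach(x => println(x))
>   }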
>
>
> On 18 August 2015 at 01:36, Axel Dahl <a...@whisperstream.com> wrote:
>
>> I have a 4-node cluster and have been playing around with the
>> num-executors, executor-memory, and executor-cores parameters.
>>
>> I set the following:
>> --executor-memory=10G
>> --num-executors=1
>> --executor-cores=8
>>
>> But when I run the job, I see that each worker is running one executor
>> with 2 cores and 2.5G of memory.
>>
>> What I'd like to do instead is have Spark allocate the whole job to a
>> single worker node.
>>
>> Is that possible in standalone mode, or do I need a job/resource scheduler
>> like YARN to do that?
>>
>> Thanks in advance,
>>
>> -Axel
>>
>>
>>
>
