(Adding Spark user list)

Hi Tom,

If I understand correctly, you're saying that you're running into memory
problems because the scheduler is allocating too many CPUs and not enough
memory to accommodate them, right?

In fine-grained mode I don't think that's a problem, since we have a
fixed amount of CPU and memory per task.
In coarse-grained mode, however, you can run into it: the executor keeps
accepting cores up to the spark.cores.max limit while its memory stays a
fixed number.
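
For reference, here's roughly how the two modes get configured today (a
minimal sketch using the standard SparkConf API; the 8-core / 4g figures
are just example values):

  import org.apache.spark.SparkConf

  // Coarse-grained: one long-lived executor per slave that grabs cores
  // up to spark.cores.max, but always with the same fixed heap.
  val coarse = new SparkConf()
    .set("spark.mesos.coarse", "true")
    .set("spark.cores.max", "8")
    .set("spark.executor.memory", "4g")

  // Fine-grained: each Spark task runs as its own Mesos task with a
  // fixed CPU share, so CPU and memory stay in step per task.
  val fine = new SparkConf()
    .set("spark.mesos.coarse", "false")
    .set("spark.task.cpus", "1")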

I have a patch out that lets you configure the maximum number of CPUs a
coarse-grained executor should use, and it also allows multiple executors
per slave in coarse-grained mode. So you could, say, launch multiple
executors of at most 4 cores each, each with spark.executor.memory (plus
overhead, etc.), on a single slave. (
https://github.com/apache/spark/pull/4027)
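
To give a feel for it, something like this (the per-executor cores
property below is a hypothetical name just for illustration; see the PR
for the actual one):

  // "spark.mesos.coarse.cores.max" is a made-up stand-in for the
  // setting the patch introduces.
  val conf = new SparkConf()
    .set("spark.mesos.coarse", "true")
    .set("spark.mesos.coarse.cores.max", "4") // hypothetical: cap per executor
    .set("spark.executor.memory", "4g")       // each 4-core executor gets 4g
  // On a 16-core slave the scheduler could then launch up to four
  // 4-core executors, each with its own 4g (+ overhead) allocation.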

It might also be interesting to add a cores-to-memory multiplier, so that
with a larger number of cores we scale the memory by some factor. I'm not
entirely sure that would be intuitive to use, though, or that people would
know what to set it to, since the right value likely changes with the
workload.
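
Roughly what I have in mind (a sketch only, not an existing setting; the
base and per-core figures are made up for the example):

  // Derive the memory to request from the cores in the offer,
  // using a per-core multiplier.
  def executorMemoryMb(offeredCores: Int,
                       baseMb: Int = 512,
                       mbPerCore: Int = 1024): Int =
    baseMb + offeredCores * mbPerCore

  // An 8-core offer would then request 512 + 8 * 1024 = 8704 MB.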

Tim

On Sat, Apr 11, 2015 at 9:51 AM, Tom Arnfeld <t...@duedil.com> wrote:

> We're running Spark 1.3.0 (with a couple of patches over the top for
> docker related bits).
>
> I don't think SPARK-4158 is related to what we're seeing, things do run
> fine on the cluster, given a ridiculously large executor memory
> configuration. As for SPARK-3535, although that looks useful, I think we're
> seeing something else.
>
> Put another way, the amount of memory required at any given time by the
> Spark JVM process is directly proportional to the amount of CPU it has,
> because more CPU means more tasks, and more tasks mean more memory. Even
> if we're using coarse mode, the amount of executor memory should be
> proportional to the number of CPUs in the offer.
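>
> A rough illustration of what I mean (the per-task figure is made up;
> assuming spark.task.cpus=1, so concurrent tasks == executor cores):
>
>   // Each concurrent task needs its own working set, so heap demand
>   // grows linearly with the cores in the offer.
>   def neededHeapMb(cores: Int, perTaskMb: Int = 512): Int =
>     cores * perTaskMb
>
>   // 4-core offer -> ~2048 MB, 16-core offer -> ~8192 MB, yet
>   // spark.executor.memory is the same fixed number in both cases.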
>
> On 11 April 2015 at 17:39, Brenden Matthews <bren...@diddyinc.com> wrote:
>
>> I ran into some issues with it a while ago, and submitted a couple PRs to
>> fix it:
>>
>> https://github.com/apache/spark/pull/2401
>> https://github.com/apache/spark/pull/3024
>>
>> Do these look relevant? What version of Spark are you running?
>>
>> On Sat, Apr 11, 2015 at 9:33 AM, Tom Arnfeld <t...@duedil.com> wrote:
>>
>>> Hey,
>>>
> >>> Not sure whether it's best to ask this on the Spark mailing list or the
> >>> Mesos one, so I'll try here first :-)
>>>
> >>> I'm having a bit of trouble with out-of-memory errors in my Spark
> >>> jobs... it seems fairly odd to me that memory resources can only be set at
>>> the executor level, and not also at the task level. For example, as far as
>>> I can tell there's only a *spark.executor.memory* config option.
>>>
>>> Surely the memory requirements of a single executor are quite
>>> dramatically influenced by the number of concurrent tasks running? Given a
>>> shared cluster, I have no idea what % of an individual slave my executor is
>>> going to get, so I basically have to set the executor memory to a value
>>> that's correct when the whole machine is in use...
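> >>>
> >>> To be concrete, this is the only memory knob I can find (this is the
> >>> real SparkConf API; 4g is just an example value):
> >>>
> >>>   import org.apache.spark.SparkConf
> >>>   val conf = new SparkConf()
> >>>     .set("spark.executor.memory", "4g") // per executor, not per task
> >>>   // There's a spark.task.cpus for CPU, but no per-task memory analogue.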
>>>
>>> Has anyone else running Spark on Mesos come across this, or maybe
>>> someone could correct my understanding of the config options?
>>>
>>> Thanks!
>>>
>>> Tom.
>>>
>>
>>
>
