got it, thanks for clarifying!

On Thu, Feb 2, 2017 at 2:57 PM, Michael Gummelt <mgumm...@mesosphere.io>
wrote:

> Yes, that's expected.  spark.executor.cores sizes a single executor.  It
> doesn't limit the number of executors.  For that, you need spark.cores.max
> (--total-executor-cores).
>
> And rdd.parallelize does not specify the number of executors.  It
> specifies the number of partitions, which relates to the number of tasks,
> not executors.  Unless you're running with dynamic allocation enabled, the
> number of executors for your job is static, and determined at start time.
> It's not influenced by your job itself.
>
>
> On Thu, Feb 2, 2017 at 2:42 PM, Ji Yan <ji...@drive.ai> wrote:
>
>> I tried setting spark.executor.cores per executor, but Spark seems to be
>> spinning up as many executors as possible up to spark.cores.max or however
>> many cpu cores available on the cluster, and this may be undesirable
>> because the number of executors in rdd.parallelize(collection, # of
>> partitions) is being overriden
>>
>> On Thu, Feb 2, 2017 at 1:30 PM, Michael Gummelt <mgumm...@mesosphere.io>
>> wrote:
>>
>>> As of Spark 2.0, Mesos mode does support setting cores on the executor
>>> level, but you might need to set the property directly (--conf
>>> spark.executor.cores=<cores>).  I've written about this here:
>>> https://docs.mesosphere.com/1.8/usage/service-guides/spark/j
>>> ob-scheduling/.  That doc is for DC/OS, but the configuration is the
>>> same.
>>>
>>> On Thu, Feb 2, 2017 at 1:06 PM, Ji Yan <ji...@drive.ai> wrote:
>>>
>>>> I was mainly confused why this is the case with memory, but with cpu
>>>> cores, it is not specified on per executor level
>>>>
>>>> On Thu, Feb 2, 2017 at 1:02 PM, Michael Gummelt <mgumm...@mesosphere.io
>>>> > wrote:
>>>>
>>>>> It sounds like you've answered your own question, right?
>>>>> --executor-memory means the memory per executor.  If you have no executor
>>>>> w/ 200GB memory, then the driver will accept no offers.
>>>>>
>>>>> On Thu, Feb 2, 2017 at 1:01 PM, Ji Yan <ji...@drive.ai> wrote:
>>>>>
>>>>>> sorry, to clarify, i was using --executor-memory for memory,
>>>>>> and --total-executor-cores for cpu cores
>>>>>>
>>>>>> On Thu, Feb 2, 2017 at 12:56 PM, Michael Gummelt <
>>>>>> mgumm...@mesosphere.io> wrote:
>>>>>>
>>>>>>> What CLI args are your referring to?  I'm aware of spark-submit's
>>>>>>> arguments (--executor-memory, --total-executor-cores, and 
>>>>>>> --executor-cores)
>>>>>>>
>>>>>>> On Thu, Feb 2, 2017 at 12:41 PM, Ji Yan <ji...@drive.ai> wrote:
>>>>>>>
>>>>>>>> I have done a experiment on this today. It shows that only CPUs are
>>>>>>>> tolerant of insufficient cluster size when a job starts. On my 
>>>>>>>> cluster, I
>>>>>>>> have 180Gb of memory and 64 cores, when I run spark-submit ( on mesos )
>>>>>>>> with --cpu_cores set to 1000, the job starts up with 64 cores. but 
>>>>>>>> when I
>>>>>>>> set --memory to 200Gb, the job fails to start with "Initial job
>>>>>>>> has not accepted any resources; check your cluster UI to ensure that
>>>>>>>> workers are registered and have sufficient resources"
>>>>>>>>
>>>>>>>> Also it is confusing to me that --cpu_cores specifies the number of
>>>>>>>> cpu cores across all executors, but --memory specifies per executor 
>>>>>>>> memory
>>>>>>>> requirement.
>>>>>>>>
>>>>>>>> On Mon, Jan 30, 2017 at 11:34 AM, Michael Gummelt <
>>>>>>>> mgumm...@mesosphere.io> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Jan 30, 2017 at 9:47 AM, Ji Yan <ji...@drive.ai> wrote:
>>>>>>>>>
>>>>>>>>>> Tasks begin scheduling as soon as the first executor comes up
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks all for the clarification. Is this the default behavior of
>>>>>>>>>> Spark on Mesos today? I think this is what we are looking for because
>>>>>>>>>> sometimes a job can take up lots of resources and later jobs could 
>>>>>>>>>> not get
>>>>>>>>>> all the resources that it asks for. If a Spark job starts with only a
>>>>>>>>>> subset of resources that it asks for, does it know to expand its 
>>>>>>>>>> resources
>>>>>>>>>> later when more resources become available?
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Yes.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Launch each executor with at least 1GB RAM, but if mesos offers
>>>>>>>>>>> 2GB at some moment, then launch an executor with 2GB RAM
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> This is less useful in our use case. But I am also quite
>>>>>>>>>> interested in cases in which this could be helpful. I think this 
>>>>>>>>>> will also
>>>>>>>>>> help with overall resource utilization on the cluster if when 
>>>>>>>>>> another job
>>>>>>>>>> starts up that has a hard requirement on resources, the extra 
>>>>>>>>>> resources to
>>>>>>>>>> the first job can be flexibly re-allocated to the second job.
>>>>>>>>>>
>>>>>>>>>> On Sat, Jan 28, 2017 at 2:32 PM, Michael Gummelt <
>>>>>>>>>> mgumm...@mesosphere.io> wrote:
>>>>>>>>>>
>>>>>>>>>>> We've talked about that, but it hasn't become a priority because
>>>>>>>>>>> we haven't had a driving use case.  If anyone has a good argument 
>>>>>>>>>>> for
>>>>>>>>>>> "variable" resource allocation like this, please let me know.
>>>>>>>>>>>
>>>>>>>>>>> On Sat, Jan 28, 2017 at 9:17 AM, Shuai Lin <
>>>>>>>>>>> linshuai2...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> An alternative behavior is to launch the job with the best
>>>>>>>>>>>>> resource offer Mesos is able to give
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Michael has just made an excellent explanation about dynamic
>>>>>>>>>>>> allocation support in mesos. But IIUC, what you want to achieve is
>>>>>>>>>>>> something like (using RAM as an example) : "Launch each executor 
>>>>>>>>>>>> with at
>>>>>>>>>>>> least 1GB RAM, but if mesos offers 2GB at some moment, then launch 
>>>>>>>>>>>> an
>>>>>>>>>>>> executor with 2GB RAM".
>>>>>>>>>>>>
>>>>>>>>>>>> I wonder what's benefit of that? To reduce the "resource
>>>>>>>>>>>> fragmentation"?
>>>>>>>>>>>>
>>>>>>>>>>>> Anyway, that is not supported at this moment. In all the
>>>>>>>>>>>> supported cluster managers of spark (mesos, yarn, standalone, and 
>>>>>>>>>>>> the
>>>>>>>>>>>> up-to-coming spark on kubernetes), you have to specify the cores 
>>>>>>>>>>>> and memory
>>>>>>>>>>>> of each executor.
>>>>>>>>>>>>
>>>>>>>>>>>> It may not be supported in the future, because only mesos has
>>>>>>>>>>>> the concepts of offers because of its two-level scheduling model.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Sat, Jan 28, 2017 at 1:35 AM, Ji Yan <ji...@drive.ai> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Dear Spark Users,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Currently is there a way to dynamically allocate resources to
>>>>>>>>>>>>> Spark on Mesos? Within Spark we can specify the CPU cores, memory 
>>>>>>>>>>>>> before
>>>>>>>>>>>>> running job. The way I understand is that the Spark job will not 
>>>>>>>>>>>>> run if the
>>>>>>>>>>>>> CPU/Mem requirement is not met. This may lead to decrease in 
>>>>>>>>>>>>> overall
>>>>>>>>>>>>> utilization of the cluster. An alternative behavior is to launch 
>>>>>>>>>>>>> the job
>>>>>>>>>>>>> with the best resource offer Mesos is able to give. Is this 
>>>>>>>>>>>>> possible with
>>>>>>>>>>>>> the current implementation?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>> Ji
>>>>>>>>>>>>>
>>>>>>>>>>>>> The information in this email is confidential and may be
>>>>>>>>>>>>> legally privileged. It is intended solely for the addressee. 
>>>>>>>>>>>>> Access to this
>>>>>>>>>>>>> email by anyone else is unauthorized. If you are not the intended
>>>>>>>>>>>>> recipient, any disclosure, copying, distribution or any action 
>>>>>>>>>>>>> taken or
>>>>>>>>>>>>> omitted to be taken in reliance on it, is prohibited and may be 
>>>>>>>>>>>>> unlawful.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Michael Gummelt
>>>>>>>>>>> Software Engineer
>>>>>>>>>>> Mesosphere
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> The information in this email is confidential and may be legally
>>>>>>>>>> privileged. It is intended solely for the addressee. Access to this 
>>>>>>>>>> email
>>>>>>>>>> by anyone else is unauthorized. If you are not the intended 
>>>>>>>>>> recipient, any
>>>>>>>>>> disclosure, copying, distribution or any action taken or omitted to 
>>>>>>>>>> be
>>>>>>>>>> taken in reliance on it, is prohibited and may be unlawful.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Michael Gummelt
>>>>>>>>> Software Engineer
>>>>>>>>> Mesosphere
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> The information in this email is confidential and may be legally
>>>>>>>> privileged. It is intended solely for the addressee. Access to this 
>>>>>>>> email
>>>>>>>> by anyone else is unauthorized. If you are not the intended recipient, 
>>>>>>>> any
>>>>>>>> disclosure, copying, distribution or any action taken or omitted to be
>>>>>>>> taken in reliance on it, is prohibited and may be unlawful.
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Michael Gummelt
>>>>>>> Software Engineer
>>>>>>> Mesosphere
>>>>>>>
>>>>>>
>>>>>>
>>>>>> The information in this email is confidential and may be legally
>>>>>> privileged. It is intended solely for the addressee. Access to this email
>>>>>> by anyone else is unauthorized. If you are not the intended recipient, 
>>>>>> any
>>>>>> disclosure, copying, distribution or any action taken or omitted to be
>>>>>> taken in reliance on it, is prohibited and may be unlawful.
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Michael Gummelt
>>>>> Software Engineer
>>>>> Mesosphere
>>>>>
>>>>
>>>>
>>>> The information in this email is confidential and may be legally
>>>> privileged. It is intended solely for the addressee. Access to this email
>>>> by anyone else is unauthorized. If you are not the intended recipient, any
>>>> disclosure, copying, distribution or any action taken or omitted to be
>>>> taken in reliance on it, is prohibited and may be unlawful.
>>>>
>>>
>>>
>>>
>>> --
>>> Michael Gummelt
>>> Software Engineer
>>> Mesosphere
>>>
>>
>>
>> The information in this email is confidential and may be legally
>> privileged. It is intended solely for the addressee. Access to this email
>> by anyone else is unauthorized. If you are not the intended recipient, any
>> disclosure, copying, distribution or any action taken or omitted to be
>> taken in reliance on it, is prohibited and may be unlawful.
>>
>
>
>
> --
> Michael Gummelt
> Software Engineer
> Mesosphere
>



-- 
iPhone. iTypos. iApologize.

Reply via email to