Re: [Kubernetes] Resource requests and limits for Driver and Executor Pods

2018-04-03 Thread Kimoon Kim
> I'm also wondering if we should support running in other QoS classes -
https://kubernetes.io/docs/tasks/configure-pod-container/
quality-service-pod/#qos-classes, like maybe best-effort as well
i.e. launching in a configuration that has neither the limit nor the
request specified. I haven't seen a use-case but I can imagine this is a
way for people to achieve better utilization with low priority long-running
jobs.

That's interesting. Like you said, it can be a good option for low priority
jobs. But I wonder this has implication on the recovery code inside the
driver. When many executor pods get killed by K8s because the job was
BestEffort, we probably want the driver to tolerate the higher number of
casualty. So that the job could still go on and eventually finish.

Thanks,
Kimoon

On Mon, Apr 2, 2018 at 11:11 AM, Anirudh Ramanathan 
wrote:

> In summary, it looks like a combination of David's (#20943
> <https://github.com/apache/spark/pull/20943>) and Yinan's PR (#20553
> <https://github.com/apache/spark/pull/20553>) are good solutions here.
> Agreed on the importance of requesting memoryoverhead up front.
>
> I'm also wondering if we should support running in other QoS classes -
> https://kubernetes.io/docs/tasks/configure-pod-container/
> quality-service-pod/#qos-classes, like maybe best-effort as well
> i.e. launching in a configuration that has neither the limit nor the
> request specified. I haven't seen a use-case but I can imagine this is a
> way for people to achieve better utilization with low priority long-running
> jobs.
>
> On Fri, Mar 30, 2018 at 3:06 PM Yinan Li  wrote:
>
>> Yes, the PR allows you to set say 1.5. The New configuration property
>> defaults to spark.executor.cores, which defaults to 1.
>>
>> On Fri, Mar 30, 2018, 3:03 PM Kimoon Kim  wrote:
>>
>>> David, glad it helped! And thanks for your clear example.
>>>
>>> > The only remaining question would then be what a sensible default for
>>> *spark.kubernetes.executor.cores *would be. Seeing that I wanted more
>>> than 1 and Yinan wants less, leaving it at 1 night be best.
>>>
>>> 1 as default SGTM.
>>>
>>> Thanks,
>>> Kimoon
>>>
>>> On Fri, Mar 30, 2018 at 1:38 PM, David Vogelbacher <
>>> dvogelbac...@palantir.com> wrote:
>>>
>>>> Thanks for linking that PR Kimoon.
>>>>
>>>>
>>>> It actually does mostly address the issue I was referring to. As the
>>>> issue <https://github.com/apache-spark-on-k8s/spark/issues/352> I
>>>> linked in my first email states, one physical cpu might not be enough to
>>>> execute a task in a performant way.
>>>>
>>>>
>>>>
>>>> So if I set *spark.executor.cores=1* and *spark.task.cpus=1* , I will
>>>> get 1 core from Kubernetes and execute one task per Executor and run into
>>>> performance problems.
>>>>
>>>> Being able to specify `spark.kubernetes.executor.cores=1.2` would fix
>>>> the issue (1.2 is just an example).
>>>>
>>>> I am curious as to why you, Yinan, would want to use this property to
>>>> request less than 1 physical cpu (that is how it sounds to me on the PR).
>>>>
>>>> Do you have testing that indicates that less than 1 physical CPU is
>>>> enough for executing tasks?
>>>>
>>>>
>>>>
>>>> In the end it boils down to the question proposed by Yinan:
>>>>
>>>> > A relevant question is should Spark on Kubernetes really be
>>>> opinionated on how to set the cpu request and limit and even try to
>>>> determine this automatically?
>>>>
>>>>
>>>>
>>>> And I completely agree with your answer Kimoon, we should provide
>>>> sensible defaults and make it configurable, as Yinan’s PR does.
>>>>
>>>> The only remaining question would then be what a sensible default for 
>>>> *spark.kubernetes.executor.cores
>>>> *would be. Seeing that I wanted more than 1 and Yinan wants less,
>>>> leaving it at 1 night be best.
>>>>
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> David
>>>>
>>>>
>>>>
>>>> *From: *Kimoon Kim 
>>>> *Date: *Friday, March 30, 2018 at 4:28 PM
>>>> *To: *Yinan Li 
>>>> *Cc: *David Vogelbacher , "
>>>> dev@spark.apache.org" 
>>>> *Subject: *Re: [Kubernetes] Resource requests and

Re: [Kubernetes] Resource requests and limits for Driver and Executor Pods

2018-04-02 Thread Anirudh Ramanathan
In summary, it looks like a combination of David's (#20943
<https://github.com/apache/spark/pull/20943>) and Yinan's PR (#20553
<https://github.com/apache/spark/pull/20553>) are good solutions here.
Agreed on the importance of requesting memoryoverhead up front.

I'm also wondering if we should support running in other QoS classes -
https://kubernetes.io/docs/tasks/configure-pod-container/quality-service-pod/#qos-classes,
like maybe best-effort as well
i.e. launching in a configuration that has neither the limit nor the
request specified. I haven't seen a use-case but I can imagine this is a
way for people to achieve better utilization with low priority long-running
jobs.

On Fri, Mar 30, 2018 at 3:06 PM Yinan Li  wrote:

> Yes, the PR allows you to set say 1.5. The New configuration property
> defaults to spark.executor.cores, which defaults to 1.
>
> On Fri, Mar 30, 2018, 3:03 PM Kimoon Kim  wrote:
>
>> David, glad it helped! And thanks for your clear example.
>>
>> > The only remaining question would then be what a sensible default for
>> *spark.kubernetes.executor.cores *would be. Seeing that I wanted more
>> than 1 and Yinan wants less, leaving it at 1 night be best.
>>
>> 1 as default SGTM.
>>
>> Thanks,
>> Kimoon
>>
>> On Fri, Mar 30, 2018 at 1:38 PM, David Vogelbacher <
>> dvogelbac...@palantir.com> wrote:
>>
>>> Thanks for linking that PR Kimoon.
>>>
>>>
>>> It actually does mostly address the issue I was referring to. As the
>>> issue <https://github.com/apache-spark-on-k8s/spark/issues/352> I
>>> linked in my first email states, one physical cpu might not be enough to
>>> execute a task in a performant way.
>>>
>>>
>>>
>>> So if I set *spark.executor.cores=1* and *spark.task.cpus=1* , I will
>>> get 1 core from Kubernetes and execute one task per Executor and run into
>>> performance problems.
>>>
>>> Being able to specify `spark.kubernetes.executor.cores=1.2` would fix
>>> the issue (1.2 is just an example).
>>>
>>> I am curious as to why you, Yinan, would want to use this property to
>>> request less than 1 physical cpu (that is how it sounds to me on the PR).
>>>
>>> Do you have testing that indicates that less than 1 physical CPU is
>>> enough for executing tasks?
>>>
>>>
>>>
>>> In the end it boils down to the question proposed by Yinan:
>>>
>>> > A relevant question is should Spark on Kubernetes really be
>>> opinionated on how to set the cpu request and limit and even try to
>>> determine this automatically?
>>>
>>>
>>>
>>> And I completely agree with your answer Kimoon, we should provide
>>> sensible defaults and make it configurable, as Yinan’s PR does.
>>>
>>> The only remaining question would then be what a sensible default for 
>>> *spark.kubernetes.executor.cores
>>> *would be. Seeing that I wanted more than 1 and Yinan wants less,
>>> leaving it at 1 night be best.
>>>
>>>
>>>
>>> Thanks,
>>>
>>> David
>>>
>>>
>>>
>>> *From: *Kimoon Kim 
>>> *Date: *Friday, March 30, 2018 at 4:28 PM
>>> *To: *Yinan Li 
>>> *Cc: *David Vogelbacher , "
>>> dev@spark.apache.org" 
>>> *Subject: *Re: [Kubernetes] Resource requests and limits for Driver and
>>> Executor Pods
>>>
>>>
>>>
>>> I see. Good to learn the interaction between spark.task.cpus and
>>> spark.executor.cores. But am I right to say that PR #20553 can be still
>>> used as an additional knob on top of those two? Say a user wants 1.5 core
>>> per executor from Kubernetes, not the rounded up integer value 2?
>>>
>>>
>>>
>>> > A relevant question is should Spark on Kubernetes really be
>>> opinionated on how to set the cpu request and limit and even try to
>>> determine this automatically?
>>>
>>>
>>>
>>> Personally, I don't see how this can be auto-determined at all. I think
>>> the best we can do is to come up with sensible default values for the most
>>> common case, and provide and well-document other knobs for edge cases.
>>>
>>>
>>> Thanks,
>>>
>>> Kimoon
>>>
>>>
>>>
>>> On Fri, Mar 30, 2018 at 12:37 PM, Yinan Li  wrote:
>>>
>>> PR #20553 [github.com]
>>> <https://urldefense.proofpoint.com

Re: [Kubernetes] Resource requests and limits for Driver and Executor Pods

2018-03-30 Thread Yinan Li
Yes, the PR allows you to set say 1.5. The New configuration property
defaults to spark.executor.cores, which defaults to 1.

On Fri, Mar 30, 2018, 3:03 PM Kimoon Kim  wrote:

> David, glad it helped! And thanks for your clear example.
>
> > The only remaining question would then be what a sensible default for
> *spark.kubernetes.executor.cores *would be. Seeing that I wanted more
> than 1 and Yinan wants less, leaving it at 1 night be best.
>
> 1 as default SGTM.
>
> Thanks,
> Kimoon
>
> On Fri, Mar 30, 2018 at 1:38 PM, David Vogelbacher <
> dvogelbac...@palantir.com> wrote:
>
>> Thanks for linking that PR Kimoon.
>>
>>
>> It actually does mostly address the issue I was referring to. As the
>> issue <https://github.com/apache-spark-on-k8s/spark/issues/352> I linked
>> in my first email states, one physical cpu might not be enough to execute a
>> task in a performant way.
>>
>>
>>
>> So if I set *spark.executor.cores=1* and *spark.task.cpus=1* , I will
>> get 1 core from Kubernetes and execute one task per Executor and run into
>> performance problems.
>>
>> Being able to specify `spark.kubernetes.executor.cores=1.2` would fix the
>> issue (1.2 is just an example).
>>
>> I am curious as to why you, Yinan, would want to use this property to
>> request less than 1 physical cpu (that is how it sounds to me on the PR).
>>
>> Do you have testing that indicates that less than 1 physical CPU is
>> enough for executing tasks?
>>
>>
>>
>> In the end it boils down to the question proposed by Yinan:
>>
>> > A relevant question is should Spark on Kubernetes really be opinionated
>> on how to set the cpu request and limit and even try to determine this
>> automatically?
>>
>>
>>
>> And I completely agree with your answer Kimoon, we should provide
>> sensible defaults and make it configurable, as Yinan’s PR does.
>>
>> The only remaining question would then be what a sensible default for 
>> *spark.kubernetes.executor.cores
>> *would be. Seeing that I wanted more than 1 and Yinan wants less,
>> leaving it at 1 night be best.
>>
>>
>>
>> Thanks,
>>
>> David
>>
>>
>>
>> *From: *Kimoon Kim 
>> *Date: *Friday, March 30, 2018 at 4:28 PM
>> *To: *Yinan Li 
>> *Cc: *David Vogelbacher , "
>> dev@spark.apache.org" 
>> *Subject: *Re: [Kubernetes] Resource requests and limits for Driver and
>> Executor Pods
>>
>>
>>
>> I see. Good to learn the interaction between spark.task.cpus and
>> spark.executor.cores. But am I right to say that PR #20553 can be still
>> used as an additional knob on top of those two? Say a user wants 1.5 core
>> per executor from Kubernetes, not the rounded up integer value 2?
>>
>>
>>
>> > A relevant question is should Spark on Kubernetes really be opinionated
>> on how to set the cpu request and limit and even try to determine this
>> automatically?
>>
>>
>>
>> Personally, I don't see how this can be auto-determined at all. I think
>> the best we can do is to come up with sensible default values for the most
>> common case, and provide and well-document other knobs for edge cases.
>>
>>
>> Thanks,
>>
>> Kimoon
>>
>>
>>
>> On Fri, Mar 30, 2018 at 12:37 PM, Yinan Li  wrote:
>>
>> PR #20553 [github.com]
>> <https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_apache_spark_pull_20553&d=DwMFaQ&c=izlc9mHr637UR4lpLEZLFFS3Vn2UXBrZ4tFb6oOnmz8&r=BFXcJr3WTIvmlY-gtaiCO5QK4bLix2sgwDDpPfrZKoE&m=TrCA4oIVKyN3M_ExqpHr7bbhi14uvoEaspPwclIJI4M&s=jqIG5lO5tnV3K3SDPPxw2bEHs0i6cltoaLh8K39JTTQ&e=>
>>  is
>> more for allowing users to use a fractional value for cpu requests. The
>> existing spark.executor.cores is sufficient for specifying more than one
>> cpus.
>>
>>
>>
>> > One way to solve this could be to request more than 1 core from
>> Kubernetes per task. The exact amount we should request is unclear to me
>> (it largely depends on how many threads actually get spawned for a task).
>>
>> A good indication is spark.task.cpus, and on average how many tasks are
>> expected to run by a single executor at any point in time. If each executor
>> is only expected to run one task at most at any point in time,
>> spark.executor.cores can be set to be equal to spark.task.cpus.
>>
>> A relevant question is should Spark on Kubernetes really be opinionated
>> on how to set the cpu request and limi

Re: [Kubernetes] Resource requests and limits for Driver and Executor Pods

2018-03-30 Thread Kimoon Kim
David, glad it helped! And thanks for your clear example.

> The only remaining question would then be what a sensible default for
*spark.kubernetes.executor.cores *would be. Seeing that I wanted more than
1 and Yinan wants less, leaving it at 1 night be best.

1 as default SGTM.

Thanks,
Kimoon

On Fri, Mar 30, 2018 at 1:38 PM, David Vogelbacher <
dvogelbac...@palantir.com> wrote:

> Thanks for linking that PR Kimoon.
>
>
> It actually does mostly address the issue I was referring to. As the issue
> <https://github.com/apache-spark-on-k8s/spark/issues/352> I linked in my
> first email states, one physical cpu might not be enough to execute a task
> in a performant way.
>
>
>
> So if I set *spark.executor.cores=1* and *spark.task.cpus=1* , I will get
> 1 core from Kubernetes and execute one task per Executor and run into
> performance problems.
>
> Being able to specify `spark.kubernetes.executor.cores=1.2` would fix the
> issue (1.2 is just an example).
>
> I am curious as to why you, Yinan, would want to use this property to
> request less than 1 physical cpu (that is how it sounds to me on the PR).
>
> Do you have testing that indicates that less than 1 physical CPU is enough
> for executing tasks?
>
>
>
> In the end it boils down to the question proposed by Yinan:
>
> > A relevant question is should Spark on Kubernetes really be opinionated
> on how to set the cpu request and limit and even try to determine this
> automatically?
>
>
>
> And I completely agree with your answer Kimoon, we should provide sensible
> defaults and make it configurable, as Yinan’s PR does.
>
> The only remaining question would then be what a sensible default for 
> *spark.kubernetes.executor.cores
> *would be. Seeing that I wanted more than 1 and Yinan wants less, leaving
> it at 1 night be best.
>
>
>
> Thanks,
>
> David
>
>
>
> *From: *Kimoon Kim 
> *Date: *Friday, March 30, 2018 at 4:28 PM
> *To: *Yinan Li 
> *Cc: *David Vogelbacher , "dev@spark.apache.org"
> 
> *Subject: *Re: [Kubernetes] Resource requests and limits for Driver and
> Executor Pods
>
>
>
> I see. Good to learn the interaction between spark.task.cpus and
> spark.executor.cores. But am I right to say that PR #20553 can be still
> used as an additional knob on top of those two? Say a user wants 1.5 core
> per executor from Kubernetes, not the rounded up integer value 2?
>
>
>
> > A relevant question is should Spark on Kubernetes really be opinionated
> on how to set the cpu request and limit and even try to determine this
> automatically?
>
>
>
> Personally, I don't see how this can be auto-determined at all. I think
> the best we can do is to come up with sensible default values for the most
> common case, and provide and well-document other knobs for edge cases.
>
>
> Thanks,
>
> Kimoon
>
>
>
> On Fri, Mar 30, 2018 at 12:37 PM, Yinan Li  wrote:
>
> PR #20553 [github.com]
> <https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_apache_spark_pull_20553&d=DwMFaQ&c=izlc9mHr637UR4lpLEZLFFS3Vn2UXBrZ4tFb6oOnmz8&r=BFXcJr3WTIvmlY-gtaiCO5QK4bLix2sgwDDpPfrZKoE&m=TrCA4oIVKyN3M_ExqpHr7bbhi14uvoEaspPwclIJI4M&s=jqIG5lO5tnV3K3SDPPxw2bEHs0i6cltoaLh8K39JTTQ&e=>
>  is
> more for allowing users to use a fractional value for cpu requests. The
> existing spark.executor.cores is sufficient for specifying more than one
> cpus.
>
>
>
> > One way to solve this could be to request more than 1 core from
> Kubernetes per task. The exact amount we should request is unclear to me
> (it largely depends on how many threads actually get spawned for a task).
>
> A good indication is spark.task.cpus, and on average how many tasks are
> expected to run by a single executor at any point in time. If each executor
> is only expected to run one task at most at any point in time,
> spark.executor.cores can be set to be equal to spark.task.cpus.
>
> A relevant question is should Spark on Kubernetes really be opinionated on
> how to set the cpu request and limit and even try to determine this
> automatically?
>
>
>
> On Fri, Mar 30, 2018 at 11:40 AM, Kimoon Kim 
> wrote:
>
> > Instead of requesting `[driver,executor].memory`, we should just request
> `[driver,executor].memory + [driver,executor].memoryOverhead `. I think
> this case is a bit clearer than the CPU case, so I went ahead and filed an 
> issue
> [issues.apache.org]
> <https://urldefense.proofpoint.com/v2/url?u=https-3A__issues.apache.org_jira_browse_SPARK-2D23825&d=DwMFaQ&c=izlc9mHr637UR4lpLEZLFFS3Vn2UXBrZ4tFb6oOnmz8&r=BFXcJr3WTIvmlY-gtaiCO5QK4bLix2sgwDDpPfrZKoE&m=TrCA4oIVKyN3M_ExqpHr7bbh

Re: [Kubernetes] Resource requests and limits for Driver and Executor Pods

2018-03-30 Thread David Vogelbacher
Thanks for linking that PR Kimoon.


It actually does mostly address the issue I was referring to. As the issue I 
linked in my first email states, one physical cpu might not be enough to 
execute a task in a performant way.

 

So if I set spark.executor.cores=1 and spark.task.cpus=1 , I will get 1 core 
from Kubernetes and execute one task per Executor and run into performance 
problems.

Being able to specify `spark.kubernetes.executor.cores=1.2` would fix the issue 
(1.2 is just an example).


I am curious as to why you, Yinan, would want to use this property to request 
less than 1 physical cpu (that is how it sounds to me on the PR). 

Do you have testing that indicates that less than 1 physical CPU is enough for 
executing tasks?

 

In the end it boils down to the question proposed by Yinan:

> A relevant question is should Spark on Kubernetes really be opinionated on 
> how to set the cpu request and limit and even try to determine this 
> automatically?

 

And I completely agree with your answer Kimoon, we should provide sensible 
defaults and make it configurable, as Yinan’s PR does. 

The only remaining question would then be what a sensible default for 
spark.kubernetes.executor.cores would be. Seeing that I wanted more than 1 and 
Yinan wants less, leaving it at 1 night be best.

 

Thanks,

David 

 

From: Kimoon Kim 
Date: Friday, March 30, 2018 at 4:28 PM
To: Yinan Li 
Cc: David Vogelbacher , "dev@spark.apache.org" 

Subject: Re: [Kubernetes] Resource requests and limits for Driver and Executor 
Pods

 

I see. Good to learn the interaction between spark.task.cpus and 
spark.executor.cores. But am I right to say that PR #20553 can be still used as 
an additional knob on top of those two? Say a user wants 1.5 core per executor 
from Kubernetes, not the rounded up integer value 2? 

 

> A relevant question is should Spark on Kubernetes really be opinionated on 
> how to set the cpu request and limit and even try to determine this 
> automatically?

 

Personally, I don't see how this can be auto-determined at all. I think the 
best we can do is to come up with sensible default values for the most common 
case, and provide and well-document other knobs for edge cases.


Thanks,

Kimoon

 

On Fri, Mar 30, 2018 at 12:37 PM, Yinan Li  wrote:

PR #20553 [github.com] is more for allowing users to use a fractional value for 
cpu requests. The existing spark.executor.cores is sufficient for specifying 
more than one cpus. 




> One way to solve this could be to request more than 1 core from Kubernetes 
> per task. The exact amount we should request is unclear to me (it largely 
> depends on how many threads actually get spawned for a task).

A good indication is spark.task.cpus, and on average how many tasks are 
expected to run by a single executor at any point in time. If each executor is 
only expected to run one task at most at any point in time, 
spark.executor.cores can be set to be equal to spark.task.cpus.

A relevant question is should Spark on Kubernetes really be opinionated on how 
to set the cpu request and limit and even try to determine this automatically?

 

On Fri, Mar 30, 2018 at 11:40 AM, Kimoon Kim  wrote:

> Instead of requesting `[driver,executor].memory`, we should just request 
> `[driver,executor].memory + [driver,executor].memoryOverhead `. I think this 
> case is a bit clearer than the CPU case, so I went ahead and filed an issue 
> [issues.apache.org] with more details and made a PR [github.com].

I think this suggestion makes sense. 

 

> One way to solve this could be to request more than 1 core from Kubernetes 
> per task. The exact amount we should request is unclear to me (it largely 
> depends on how many threads actually get spawned for a task).

 

I wonder if this is being addressed by PR #20553 [github.com] written by Yinan. 
Yinan? 


Thanks,

Kimoon

 

On Thu, Mar 29, 2018 at 5:14 PM, David Vogelbacher  
wrote:

Hi,

 

At the moment driver and executor pods are created using the following requests 
and limits:

 CPUMemory
Request[driver,executor].cores[driver,executor].memory
LimitUnlimited (but can be specified using 
spark.[driver,executor].cores)[driver,executor].memory + 
[driver,executor].memoryOverhead

 

Specifying the requests like this leads to problems if the pods only get the 
requested amount of resources and nothing of the optional (limit) resources, as 
it can happen in a fully utilized cluster.

 

For memory:

Let’s say we have a node with 100GiB memory and 5 pods with 20 GiB memory and 5 
GiB memoryOverhead. 

At the beginning all 5 pods use 20 GiB of memory and all is well. If a pod then 
starts using its overhead memory it will get killed as there is no more memory 
available, even though we told spark

that it can use 25 GiB of memory.

 

Instead of requesting `[driver,executor].memory`, we should just request 
`[driver,executor].memory + [driver,executor].memoryOv

Re: [Kubernetes] Resource requests and limits for Driver and Executor Pods

2018-03-30 Thread Kimoon Kim
I see. Good to learn the interaction between spark.task.cpus and
spark.executor.cores. But am I right to say that PR #20553 can be still
used as an additional knob on top of those two? Say a user wants 1.5 core
per executor from Kubernetes, not the rounded up integer value 2?

> A relevant question is should Spark on Kubernetes really be opinionated
on how to set the cpu request and limit and even try to determine this
automatically?

Personally, I don't see how this can be auto-determined at all. I think the
best we can do is to come up with sensible default values for the most
common case, and provide and well-document other knobs for edge cases.

Thanks,
Kimoon

On Fri, Mar 30, 2018 at 12:37 PM, Yinan Li  wrote:

> PR #20553  is more for
> allowing users to use a fractional value for cpu requests. The existing
> spark.executor.cores is sufficient for specifying more than one cpus.
>
> > One way to solve this could be to request more than 1 core from
> Kubernetes per task. The exact amount we should request is unclear to me
> (it largely depends on how many threads actually get spawned for a task).
>
> A good indication is spark.task.cpus, and on average how many tasks are
> expected to run by a single executor at any point in time. If each executor
> is only expected to run one task at most at any point in time,
> spark.executor.cores can be set to be equal to spark.task.cpus.
>
> A relevant question is should Spark on Kubernetes really be opinionated on
> how to set the cpu request and limit and even try to determine this
> automatically?
>
> On Fri, Mar 30, 2018 at 11:40 AM, Kimoon Kim 
> wrote:
>
>> > Instead of requesting `[driver,executor].memory`, we should just
>> request `[driver,executor].memory + [driver,executor].memoryOverhead `. I
>> think this case is a bit clearer than the CPU case, so I went ahead and
>> filed an issue  with
>> more details and made a PR .
>>
>> I think this suggestion makes sense.
>>
>> > One way to solve this could be to request more than 1 core from
>> Kubernetes per task. The exact amount we should request is unclear to me
>> (it largely depends on how many threads actually get spawned for a task).
>>
>> I wonder if this is being addressed by PR #20553
>>  written by Yinan. Yinan?
>>
>> Thanks,
>> Kimoon
>>
>> On Thu, Mar 29, 2018 at 5:14 PM, David Vogelbacher <
>> dvogelbac...@palantir.com> wrote:
>>
>>> Hi,
>>>
>>>
>>>
>>> At the moment driver and executor pods are created using the following
>>> requests and limits:
>>>
>>>
>>>
>>> *CPU*
>>>
>>> *Memory*
>>>
>>> *Request*
>>>
>>> [driver,executor].cores
>>>
>>> [driver,executor].memory
>>>
>>> *Limit*
>>>
>>> Unlimited (but can be specified using spark.[driver,executor].cores)
>>>
>>> [driver,executor].memory + [driver,executor].memoryOverhead
>>>
>>>
>>>
>>> Specifying the requests like this leads to problems if the pods only get
>>> the requested amount of resources and nothing of the optional (limit)
>>> resources, as it can happen in a fully utilized cluster.
>>>
>>>
>>>
>>> *For memory:*
>>>
>>> Let’s say we have a node with 100GiB memory and 5 pods with 20 GiB
>>> memory and 5 GiB memoryOverhead.
>>>
>>> At the beginning all 5 pods use 20 GiB of memory and all is well. If a
>>> pod then starts using its overhead memory it will get killed as there is no
>>> more memory available, even though we told spark
>>>
>>> that it can use 25 GiB of memory.
>>>
>>>
>>>
>>> Instead of requesting `[driver,executor].memory`, we should just request
>>> `[driver,executor].memory + [driver,executor].memoryOverhead `.
>>>
>>> I think this case is a bit clearer than the CPU case, so I went ahead
>>> and filed an issue 
>>> with more details and made a PR
>>> .
>>>
>>>
>>>
>>> *For CPU:*
>>>
>>> As it turns out, there can be performance problems if we only have
>>> `executor.cores` available (which means we have one core per task). This
>>> was raised here
>>>  and is the
>>> reason that the cpu limit was set to unlimited.
>>>
>>> This issue stems from the fact that in general there will be more than
>>> one thread per task, resulting in performance impacts if there is only one
>>> core available.
>>>
>>> However, I am not sure that just setting the limit to unlimited is the
>>> best solution because it means that even if the Kubernetes cluster can
>>> perfectly satisfy the resource requests, performance might be very bad.
>>>
>>>
>>>
>>> I think we should guarantee that an executor is able to do its work well
>>> (without performance issues or getting killed - as could happen in the
>>> memory case) with the resources it gets guaranteed from Kubernetes.
>>>
>>>
>>>
>>> One way to solve this could be

Re: [Kubernetes] Resource requests and limits for Driver and Executor Pods

2018-03-30 Thread Yinan Li
PR #20553  is more for allowing
users to use a fractional value for cpu requests. The existing
spark.executor.cores is sufficient for specifying more than one cpus.

> One way to solve this could be to request more than 1 core from
Kubernetes per task. The exact amount we should request is unclear to me
(it largely depends on how many threads actually get spawned for a task).

A good indication is spark.task.cpus, and on average how many tasks are
expected to run by a single executor at any point in time. If each executor
is only expected to run one task at most at any point in time,
spark.executor.cores can be set to be equal to spark.task.cpus.

A relevant question is should Spark on Kubernetes really be opinionated on
how to set the cpu request and limit and even try to determine this
automatically?

On Fri, Mar 30, 2018 at 11:40 AM, Kimoon Kim  wrote:

> > Instead of requesting `[driver,executor].memory`, we should just
> request `[driver,executor].memory + [driver,executor].memoryOverhead `. I
> think this case is a bit clearer than the CPU case, so I went ahead and
> filed an issue  with
> more details and made a PR .
>
> I think this suggestion makes sense.
>
> > One way to solve this could be to request more than 1 core from
> Kubernetes per task. The exact amount we should request is unclear to me
> (it largely depends on how many threads actually get spawned for a task).
>
> I wonder if this is being addressed by PR #20553
>  written by Yinan. Yinan?
>
> Thanks,
> Kimoon
>
> On Thu, Mar 29, 2018 at 5:14 PM, David Vogelbacher <
> dvogelbac...@palantir.com> wrote:
>
>> Hi,
>>
>>
>>
>> At the moment driver and executor pods are created using the following
>> requests and limits:
>>
>>
>>
>> *CPU*
>>
>> *Memory*
>>
>> *Request*
>>
>> [driver,executor].cores
>>
>> [driver,executor].memory
>>
>> *Limit*
>>
>> Unlimited (but can be specified using spark.[driver,executor].cores)
>>
>> [driver,executor].memory + [driver,executor].memoryOverhead
>>
>>
>>
>> Specifying the requests like this leads to problems if the pods only get
>> the requested amount of resources and nothing of the optional (limit)
>> resources, as it can happen in a fully utilized cluster.
>>
>>
>>
>> *For memory:*
>>
>> Let’s say we have a node with 100GiB memory and 5 pods with 20 GiB memory
>> and 5 GiB memoryOverhead.
>>
>> At the beginning all 5 pods use 20 GiB of memory and all is well. If a
>> pod then starts using its overhead memory it will get killed as there is no
>> more memory available, even though we told spark
>>
>> that it can use 25 GiB of memory.
>>
>>
>>
>> Instead of requesting `[driver,executor].memory`, we should just request
>> `[driver,executor].memory + [driver,executor].memoryOverhead `.
>>
>> I think this case is a bit clearer than the CPU case, so I went ahead and
>> filed an issue  with
>> more details and made a PR .
>>
>>
>>
>> *For CPU:*
>>
>> As it turns out, there can be performance problems if we only have
>> `executor.cores` available (which means we have one core per task). This
>> was raised here 
>> and is the reason that the cpu limit was set to unlimited.
>>
>> This issue stems from the fact that in general there will be more than
>> one thread per task, resulting in performance impacts if there is only one
>> core available.
>>
>> However, I am not sure that just setting the limit to unlimited is the
>> best solution because it means that even if the Kubernetes cluster can
>> perfectly satisfy the resource requests, performance might be very bad.
>>
>>
>>
>> I think we should guarantee that an executor is able to do its work well
>> (without performance issues or getting killed - as could happen in the
>> memory case) with the resources it gets guaranteed from Kubernetes.
>>
>>
>>
>> One way to solve this could be to request more than 1 core from
>> Kubernetes per task. The exact amount we should request is unclear to me
>> (it largely depends on how many threads actually get spawned for a task).
>>
>> We would need to find a way to determine this somehow automatically or at
>> least come up with a better default value than 1 core per task.
>>
>>
>>
>> Does somebody have ideas or thoughts on how to solve this best?
>>
>>
>>
>> Best,
>>
>> David
>>
>
>


Re: [Kubernetes] Resource requests and limits for Driver and Executor Pods

2018-03-30 Thread Kimoon Kim
> Instead of requesting `[driver,executor].memory`, we should just request
`[driver,executor].memory + [driver,executor].memoryOverhead `. I think
this case is a bit clearer than the CPU case, so I went ahead and filed an
issue  with more details
and made a PR .

I think this suggestion makes sense.

> One way to solve this could be to request more than 1 core from
Kubernetes per task. The exact amount we should request is unclear to me
(it largely depends on how many threads actually get spawned for a task).

I wonder if this is being addressed by PR #20553
 written by Yinan. Yinan?

Thanks,
Kimoon

On Thu, Mar 29, 2018 at 5:14 PM, David Vogelbacher <
dvogelbac...@palantir.com> wrote:

> Hi,
>
>
>
> At the moment driver and executor pods are created using the following
> requests and limits:
>
>
>
> *CPU*
>
> *Memory*
>
> *Request*
>
> [driver,executor].cores
>
> [driver,executor].memory
>
> *Limit*
>
> Unlimited (but can be specified using spark.[driver,executor].cores)
>
> [driver,executor].memory + [driver,executor].memoryOverhead
>
>
>
> Specifying the requests like this leads to problems if the pods only get
> the requested amount of resources and nothing of the optional (limit)
> resources, as it can happen in a fully utilized cluster.
>
>
>
> *For memory:*
>
> Let’s say we have a node with 100GiB memory and 5 pods with 20 GiB memory
> and 5 GiB memoryOverhead.
>
> At the beginning all 5 pods use 20 GiB of memory and all is well. If a pod
> then starts using its overhead memory it will get killed as there is no
> more memory available, even though we told spark
>
> that it can use 25 GiB of memory.
>
>
>
> Instead of requesting `[driver,executor].memory`, we should just request
> `[driver,executor].memory + [driver,executor].memoryOverhead `.
>
> I think this case is a bit clearer than the CPU case, so I went ahead and
> filed an issue  with
> more details and made a PR .
>
>
>
> *For CPU:*
>
> As it turns out, there can be performance problems if we only have
> `executor.cores` available (which means we have one core per task). This
> was raised here 
> and is the reason that the cpu limit was set to unlimited.
>
> This issue stems from the fact that in general there will be more than one
> thread per task, resulting in performance impacts if there is only one core
> available.
>
> However, I am not sure that just setting the limit to unlimited is the
> best solution because it means that even if the Kubernetes cluster can
> perfectly satisfy the resource requests, performance might be very bad.
>
>
>
> I think we should guarantee that an executor is able to do its work well
> (without performance issues or getting killed - as could happen in the
> memory case) with the resources it gets guaranteed from Kubernetes.
>
>
>
> One way to solve this could be to request more than 1 core from Kubernetes
> per task. The exact amount we should request is unclear to me (it largely
> depends on how many threads actually get spawned for a task).
>
> We would need to find a way to determine this somehow automatically or at
> least come up with a better default value than 1 core per task.
>
>
>
> Does somebody have ideas or thoughts on how to solve this best?
>
>
>
> Best,
>
> David
>


Re: [Kubernetes] Resource requests and limits for Driver and Executor Pods

2018-03-30 Thread Matt Cheah
The question is more so generally what an advised best practice is for setting 
CPU limits. It’s not immediately clear what a correct value is for setting CPU 
limits if one wants to provide guarantees for consistent / guaranteed execution 
performance while also not degrading performance. Additionally, there’s a 
question of if there exists a sane default CPU limit in the Spark pod creation 
code. Such a default seems difficult to set because the JVM can spawn as many 
threads as it likes and a single executor can end up thrashing in between its 
own threads as they contend for the smaller CPU share that is available.

 

From: Yinan Li 
Date: Thursday, March 29, 2018 at 11:08 PM
To: David Vogelbacher 
Cc: "dev@spark.apache.org" 
Subject: Re: [Kubernetes] Resource requests and limits for Driver and Executor 
Pods

 

Hi David, 

 

Regarding cpu limit, in Spark 2.3, we do have the following config properties 
to specify cpu limit for the driver and executors. See 
http://spark.apache.org/docs/latest/running-on-kubernetes.html 
[spark.apache.org].

 

spark.kubernetes.driver.limit.cores

spark.kubernetes.executor.limit.cores

 

On Thu, Mar 29, 2018 at 5:14 PM, David Vogelbacher  
wrote:

Hi,

 

At the moment driver and executor pods are created using the following requests 
and limits:

 CPUMemory
Request[driver,executor].cores[driver,executor].memory
LimitUnlimited (but can be specified using 
spark.[driver,executor].cores)[driver,executor].memory + 
[driver,executor].memoryOverhead

 

Specifying the requests like this leads to problems if the pods only get the 
requested amount of resources and nothing of the optional (limit) resources, as 
it can happen in a fully utilized cluster.

 

For memory:

Let’s say we have a node with 100GiB memory and 5 pods with 20 GiB memory and 5 
GiB memoryOverhead. 

At the beginning all 5 pods use 20 GiB of memory and all is well. If a pod then 
starts using its overhead memory it will get killed as there is no more memory 
available, even though we told spark

that it can use 25 GiB of memory.

 

Instead of requesting `[driver,executor].memory`, we should just request 
`[driver,executor].memory + [driver,executor].memoryOverhead `.

I think this case is a bit clearer than the CPU case, so I went ahead and filed 
an issue [issues.apache.org] with more details and made a PR [github.com].

 

For CPU:

As it turns out, there can be performance problems if we only have 
`executor.cores` available (which means we have one core per task). This was 
raised here [github.com] and is the reason that the cpu limit was set to 
unlimited.

This issue stems from the fact that in general there will be more than one 
thread per task, resulting in performance impacts if there is only one core 
available.

However, I am not sure that just setting the limit to unlimited is the best 
solution because it means that even if the Kubernetes cluster can perfectly 
satisfy the resource requests, performance might be very bad.

 

I think we should guarantee that an executor is able to do its work well 
(without performance issues or getting killed - as could happen in the memory 
case) with the resources it gets guaranteed from Kubernetes.

 

One way to solve this could be to request more than 1 core from Kubernetes per 
task. The exact amount we should request is unclear to me (it largely depends 
on how many threads actually get spawned for a task). 

We would need to find a way to determine this somehow automatically or at least 
come up with a better default value than 1 core per task.

 

Does somebody have ideas or thoughts on how to solve this best?

 

Best,

David

 



smime.p7s
Description: S/MIME cryptographic signature


Re: [Kubernetes] Resource requests and limits for Driver and Executor Pods

2018-03-29 Thread Yinan Li
Hi David,

Regarding cpu limit, in Spark 2.3, we do have the following config
properties to specify cpu limit for the driver and executors. See
http://spark.apache.org/docs/latest/running-on-kubernetes.html.

spark.kubernetes.driver.limit.cores
spark.kubernetes.executor.limit.cores

On Thu, Mar 29, 2018 at 5:14 PM, David Vogelbacher <
dvogelbac...@palantir.com> wrote:

> Hi,
>
>
>
> At the moment driver and executor pods are created using the following
> requests and limits:
>
>
>
> *CPU*
>
> *Memory*
>
> *Request*
>
> [driver,executor].cores
>
> [driver,executor].memory
>
> *Limit*
>
> Unlimited (but can be specified using spark.[driver,executor].cores)
>
> [driver,executor].memory + [driver,executor].memoryOverhead
>
>
>
> Specifying the requests like this leads to problems if the pods only get
> the requested amount of resources and nothing of the optional (limit)
> resources, as it can happen in a fully utilized cluster.
>
>
>
> *For memory:*
>
> Let’s say we have a node with 100GiB memory and 5 pods with 20 GiB memory
> and 5 GiB memoryOverhead.
>
> At the beginning all 5 pods use 20 GiB of memory and all is well. If a pod
> then starts using its overhead memory it will get killed as there is no
> more memory available, even though we told spark
>
> that it can use 25 GiB of memory.
>
>
>
> Instead of requesting `[driver,executor].memory`, we should just request
> `[driver,executor].memory + [driver,executor].memoryOverhead `.
>
> I think this case is a bit clearer than the CPU case, so I went ahead and
> filed an issue  with
> more details and made a PR .
>
>
>
> *For CPU:*
>
> As it turns out, there can be performance problems if we only have
> `executor.cores` available (which means we have one core per task). This
> was raised here 
> and is the reason that the cpu limit was set to unlimited.
>
> This issue stems from the fact that in general there will be more than one
> thread per task, resulting in performance impacts if there is only one core
> available.
>
> However, I am not sure that just setting the limit to unlimited is the
> best solution because it means that even if the Kubernetes cluster can
> perfectly satisfy the resource requests, performance might be very bad.
>
>
>
> I think we should guarantee that an executor is able to do its work well
> (without performance issues or getting killed - as could happen in the
> memory case) with the resources it gets guaranteed from Kubernetes.
>
>
>
> One way to solve this could be to request more than 1 core from Kubernetes
> per task. The exact amount we should request is unclear to me (it largely
> depends on how many threads actually get spawned for a task).
>
> We would need to find a way to determine this somehow automatically or at
> least come up with a better default value than 1 core per task.
>
>
>
> Does somebody have ideas or thoughts on how to solve this best?
>
>
>
> Best,
>
> David
>