[jira] [Commented] (SPARK-21122) Address starvation issues when dynamic allocation is enabled

2017-10-18 Thread Craig Ingram (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16209918#comment-16209918
 ] 

Craig Ingram commented on SPARK-21122:
--

Finally getting back around to this. Thanks for the feedback [~tgraves]. I 
agree with pretty much everything you pointed out. Newer versions of YARN do 
have a [Cluster Metrics 
API|https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html#Cluster_Metrics_API], 
though I believe it was only introduced in 2.8. Regardless, working with 
YARN's PreemptionMessage is a better path forward than what I was initially 
proposing. I wish there were an elegant way to do this generically, but I 
believe I can at least build an abstraction over the PreemptionMessage that 
other RM clients can reuse. For now, I will focus on a YARN-specific solution.
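
To sketch what that abstraction might look like (a rough sketch only: the 
PreemptionNotice/YarnPreemptionAdapter names are hypothetical and nothing like 
them exists in Spark today, while the getPreemptionMessage/contract calls are 
YARN's actual PreemptionMessage API):

{code:scala}
import scala.collection.JavaConverters._
import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse

// Hypothetical RM-agnostic view of a preemption warning.
case class PreemptionNotice(
    strictContainerIds: Set[String],      // containers the RM will reclaim regardless
    negotiableContainerIds: Set[String])  // containers we can save by freeing resources elsewhere

object YarnPreemptionAdapter {
  /** Translate YARN's PreemptionMessage (if present) into the generic notice. */
  def fromAllocateResponse(response: AllocateResponse): Option[PreemptionNotice] = {
    Option(response.getPreemptionMessage).map { msg =>
      val strict = Option(msg.getStrictContract)
        .map(_.getContainers.asScala.map(_.getId.toString).toSet)
        .getOrElse(Set.empty[String])
      val negotiable = Option(msg.getContract)
        .map(_.getContainers.asScala.map(_.getId.toString).toSet)
        .getOrElse(Set.empty[String])
      PreemptionNotice(strict, negotiable)
    }
  }
}
{code}

A generic allocation policy could then react to a PreemptionNotice without 
knowing it came from YARN.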


[jira] [Comment Edited] (SPARK-21122) Address starvation issues when dynamic allocation is enabled

2017-06-20 Thread Craig Ingram (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16056399#comment-16056399
 ] 

Craig Ingram edited comment on SPARK-21122 at 6/20/17 8:35 PM:
---

Attached initial high-level design document.

Again, thanks for the feedback, Sean.

I haven't looked into using Mesos yet, but at least with Spark on YARN, you 
have to configure dynamic allocation in both Spark and YARN 
(yarn-site.xml/yarn-env.sh) to hook in the shuffle service.

I agree that applications shouldn't know details about other applications, but 
high-level metrics like the number of running/pending applications and pending 
requests for resources are already available through YARN's client.

Hopefully the attached doc will help clarify the proposed changes to address 
this issue. I also want to emphasize that I'm totally open to hearing other 
ways to address this issue. Thanks!
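
For illustration, the counts I have in mind can be pulled with the standard 
YarnClient API; the helper wrapping it here is hypothetical:

{code:scala}
import java.util.EnumSet
import scala.collection.JavaConverters._
import org.apache.hadoop.yarn.api.records.YarnApplicationState
import org.apache.hadoop.yarn.client.api.YarnClient
import org.apache.hadoop.yarn.conf.YarnConfiguration

// Hypothetical helper: count applications the RM has accepted but not yet started.
def pendingApplicationCount(): Int = {
  val yarnClient = YarnClient.createYarnClient()
  yarnClient.init(new YarnConfiguration())
  yarnClient.start()
  try {
    yarnClient.getApplications(EnumSet.of(YarnApplicationState.ACCEPTED)).asScala.size
  } finally {
    yarnClient.stop()
  }
}
{code}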


was (Author: craigi):
Initial high-level design document.


[jira] [Updated] (SPARK-21122) Address starvation issues when dynamic allocation is enabled

2017-06-20 Thread Craig Ingram (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-21122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Ingram updated SPARK-21122:
-
Attachment: Preventing Starvation with Dynamic Allocation Enabled.pdf

Initial high-level design document.




[jira] [Commented] (SPARK-21122) Address starvation issues when dynamic allocation is enabled

2017-06-16 Thread Craig Ingram (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16052362#comment-16052362
 ] 

Craig Ingram commented on SPARK-21122:
--

Thanks for the feedback, Sean. This ticket simply represents one of the issues 
brought up in [SPARK-21084|https://issues.apache.org/jira/browse/SPARK-21084]. 
Since this topic covers a lot of ground, we wanted to create a separate ticket 
for the discussion and work. We are definitely open to alternate approaches as 
well.

I agree that this is something the resource managers should manage, but I don't 
see a way to control deallocation at a reasonable granularity from within a 
resource manager. Even without these changes, you still have to configure 
things in two places. I also don't believe you would want to enable something 
like YARN's preemption while using this feature, as the two mechanisms would 
end up fighting each other.

At the very least, the existing policy needs some additional logic to 
preemptively decommission executors when other applications are being starved. 
Couple this with 
[SPARK-21097|https://issues.apache.org/jira/browse/SPARK-21097] and even 
executors with cached data can safely be removed without a huge performance 
impact. I think the improvement here is maximizing cluster utilization while 
also allowing other applications to start and fairly consume resources.
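
As a rough sketch of that hook (the starvation signal and victim selection are 
hypothetical placeholders; SparkContext.killExecutors is an existing developer 
API):

{code:scala}
import org.apache.spark.SparkContext

// Sketch only: starvedApplicationsPending and pickVictims stand in for the
// monitoring and policy pieces proposed in this ticket.
def rebalanceIfStarving(
    sc: SparkContext,
    starvedApplicationsPending: () => Boolean,
    pickVictims: () => Seq[String]): Unit = {
  if (starvedApplicationsPending()) {
    val victims = pickVictims() // e.g. idle, short-lived executors with no cached data
    if (victims.nonEmpty) {
      sc.killExecutors(victims) // hand the resources back to the resource manager
    }
  }
}
{code}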


[jira] [Created] (SPARK-21122) Address starvation issues when dynamic allocation is enabled

2017-06-16 Thread Craig Ingram (JIRA)
Craig Ingram created SPARK-21122:


 Summary: Address starvation issues when dynamic allocation is 
enabled
 Key: SPARK-21122
 URL: https://issues.apache.org/jira/browse/SPARK-21122
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core, YARN
Affects Versions: 2.2.0, 2.3.0
Reporter: Craig Ingram


When dynamic resource allocation is enabled on a cluster, it’s currently 
possible for one application to consume all the cluster’s resources, 
effectively starving any other application trying to start. This is 
particularly painful in a notebook environment where notebooks may be idle for 
tens of minutes while the user is figuring out what to do next (or eating their 
lunch). Ideally the application should give resources back to the cluster when 
monitoring indicates other applications are pending.

Before delving into the specifics of the solution, there are some workarounds 
to this problem that are worth mentioning:
* Set spark.dynamicAllocation.maxExecutors to a small value, so that users are 
unlikely to use the entire cluster even when many of them are doing work. This 
approach will hurt cluster utilization (see the configuration sketch after 
this list).
* If using YARN, enable preemption and have each application (or organization) 
run in a separate queue. The downside of this is that when YARN preempts, it 
doesn't know anything about which executor it's killing. It would just as 
likely kill a long-running executor with cached data as one that just spun up. 
Moreover, given a feature like 
https://issues.apache.org/jira/browse/SPARK-21097 (preserving cached data on 
executor decommission), YARN may not wait long enough between the graceful and 
forceful shutdown attempts, meaning the blocks that belonged to that executor 
would be lost and have to be recomputed.
* Configure YARN to use the capacity scheduler with multiple scheduler queues, 
and put high-priority notebook users into a high-priority queue. This prevents 
high-priority users from being starved out by low-priority notebook users, but 
does not prevent users in the same priority class from starving each other.
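
For reference, the first workaround is purely a configuration change; here is a 
minimal sketch of setting the cap programmatically (the cap of 10 executors is 
an arbitrary illustration):

{code:scala}
import org.apache.spark.{SparkConf, SparkContext}

// Cap how much of the cluster a single application can hold.
val conf = new SparkConf()
  .setAppName("notebook-session")
  .set("spark.dynamicAllocation.enabled", "true")
  .set("spark.shuffle.service.enabled", "true") // required for dynamic allocation
  .set("spark.dynamicAllocation.maxExecutors", "10")

val sc = new SparkContext(conf)
{code}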

Obviously any solution to this problem that depends on YARN would leave other 
resource managers out in the cold. The solution proposed in this ticket will 
afford Spark clusters the flexibility to hook in different resource allocation 
policies to fulfill their users' needs regardless of resource manager choice. 
Initially the focus will be on users in a notebook environment. When operating 
in a notebook environment with many users, the goal is fair resource 
allocation. Given that all users will be using the same memory configuration, 
this solution will focus primarily on fair sharing of cores.
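
To make the shape of the proposed hook concrete, a pluggable policy interface 
might look like the following (all of these names are hypothetical; none exist 
in Spark today):

{code:scala}
/** Snapshot of cluster state a policy may consult; the fields are illustrative. */
case class ClusterSnapshot(
    totalCores: Int,
    coresByApplication: Map[String, Int],
    pendingApplications: Int,
    pendingCoreRequests: Int)

trait ResourceAllocationPolicy {
  /** Executors this application should voluntarily give up, if any. */
  def executorsToRelease(snapshot: ClusterSnapshot): Seq[String]

  /** Upper bound on cores this application may hold given current utilization.
   *  A fair policy could return more than totalCores / numApplications when
   *  other applications are idle, matching the expansion example below. */
  def maxCores(snapshot: ClusterSnapshot): Int
}
{code}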

The fair resource allocation policy should pick executors to remove based on 
three factors initially: idleness, presence of cached data, and uptime. The 
policy will favor removing executors that are idle, short-lived, and have no 
cached data. The policy will only preemptively remove executors if there are 
pending applications or cores (otherwise the default dynamic allocation 
timeout/removal process is followed). The policy will also allow an 
application's resource consumption to expand based on cluster utilization. For 
example, if there are 3 applications running but 2 of them are idle, the policy 
will allow a busy application with pending tasks to consume more than 1/3rd of 
the cluster's resources.
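
A minimal sketch of that removal ranking, assuming a hypothetical per-executor 
descriptor:

{code:scala}
// Hypothetical executor descriptor; illustrative only.
case class ExecutorInfo(
    id: String,
    idleMillis: Long,       // time since the executor last ran a task
    uptimeMillis: Long,     // how long the executor has been alive
    hasCachedData: Boolean)

/** Rank executors for removal: idle, short-lived, cache-free ones go first. */
def removalOrder(executors: Seq[ExecutorInfo]): Seq[ExecutorInfo] =
  executors.sortBy { e =>
    // Tuple ordering: prefer no cached data, then the most idle, then the youngest.
    (e.hasCachedData, -e.idleMillis, e.uptimeMillis)
  }
{code}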

More complexity could be added to take advantage of task/stage metrics, 
histograms, and heuristics (e.g. favor removing executors running short 
tasks). The important thing here is to benchmark effectively before adding 
complexity, so we can measure the impact of the changes.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org