This is an automated email from the ASF dual-hosted git repository.

tgraves pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new e926d41  [SPARK-30322][DOCS] Add stage level scheduling docs
e926d41 is described below

commit e926d419d305c9400f6f2426ca3e8d04a9180005
Author: Thomas Graves <tgra...@nvidia.com>
AuthorDate: Wed Jul 29 13:46:28 2020 -0500

    [SPARK-30322][DOCS] Add stage level scheduling docs
    
    ### What changes were proposed in this pull request?
    
    Document the stage level scheduling feature.
    
    ### Why are the changes needed?
    
    Document the stage level scheduling feature.
    
    ### Does this PR introduce _any_ user-facing change?
    
    Documentation.
    
    ### How was this patch tested?
    
    n/a docs only
    
    Closes #29292 from tgravescs/SPARK-30322.
    
    Authored-by: Thomas Graves <tgra...@nvidia.com>
    Signed-off-by: Thomas Graves <tgra...@apache.org>
---
 docs/configuration.md   | 7 +++++++
 docs/running-on-yarn.md | 4 ++++
 2 files changed, 11 insertions(+)

diff --git a/docs/configuration.md b/docs/configuration.md
index abf7610..62799db 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -3028,3 +3028,10 @@ There are configurations available to request resources for the driver: <code>sp
 Spark will use the configurations specified to first request containers with the corresponding resources from the cluster manager. Once it gets the container, Spark launches an Executor in that container which will discover what resources the container has and the addresses associated with each resource. The Executor will register with the Driver and report back the resources available to that Executor. The Spark scheduler can then schedule tasks to each Executor and assign specific reso [...]
 
 See your cluster manager specific page for requirements and details on each of - [YARN](running-on-yarn.html#resource-allocation-and-configuration-overview), [Kubernetes](running-on-kubernetes.html#resource-allocation-and-configuration-overview) and [Standalone Mode](spark-standalone.html#resource-allocation-and-configuration-overview). It is currently not available with Mesos or local mode. And please also note that local-cluster mode with multiple workers is not supported (see Standalon [...]
+
+# Stage Level Scheduling Overview
+
+The stage level scheduling feature allows users to specify task and executor resource requirements at the stage level, so that different stages can run with executors that have different resources. A prime example is an ETL stage that runs with CPU-only executors, followed by an ML stage that needs GPUs. Stage level scheduling allows users to request executors with GPUs when the ML stage runs, rather than having to acquire executors with GPUs at th [...]
+This feature is only available for the RDD API in Scala, Java, and Python, requires dynamic allocation to be enabled, and is currently only supported on YARN. See the [YARN](running-on-yarn.html#stage-level-scheduling-overview) page for more implementation details.
+
+See the `RDD.withResources` and `ResourceProfileBuilder` APIs for using this feature. The current implementation acquires new executors for each `ResourceProfile` created, and the match currently has to be exact: Spark does not try to fit tasks into an executor whose `ResourceProfile` differs from the one the tasks require. Executors that are not in use will idle timeout via the normal dynamic allocation logic. The default configuration for this feature is to only allow one Resour [...]
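As a sketch of how the `RDD.withResources` and `ResourceProfileBuilder` APIs fit together, here is a minimal, hypothetical PySpark fragment. It is not part of this commit: it assumes a live Spark application on YARN with dynamic allocation enabled, and `etl_rdd`, `train_partition`, and the discovery script path are illustrative placeholders.

```python
# Illustrative sketch only -- assumes a running SparkContext on YARN with
# dynamic allocation enabled; `etl_rdd`, `train_partition`, and the
# discovery script path below are hypothetical placeholders.
from pyspark.resource import (ExecutorResourceRequests,
                              ResourceProfileBuilder,
                              TaskResourceRequests)

# Executors for the ML stage: 2 cores and 1 GPU each, 1 GPU per task.
exec_reqs = ExecutorResourceRequests().cores(2).resource(
    "gpu", 1, discoveryScript="/opt/spark/scripts/getGpusResources.sh")
task_reqs = TaskResourceRequests().cpus(1).resource("gpu", 1)
profile = ResourceProfileBuilder().require(exec_reqs).require(task_reqs).build

# Only this stage runs with GPU executors; earlier ETL stages keep the
# default profile. New executors matching `profile` are acquired here.
ml_result = etl_rdd.withResources(profile).map(train_partition).collect()
```

Because matching is exact, tasks tagged with `profile` will only be scheduled onto executors that were acquired for that same profile.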
diff --git a/docs/running-on-yarn.md b/docs/running-on-yarn.md
index 36d8f0b..6f7aaf2b 100644
--- a/docs/running-on-yarn.md
+++ b/docs/running-on-yarn.md
@@ -641,6 +641,10 @@ If the user has a user defined YARN resource, let's call it `acceleratorX` then t
 
 YARN does not tell Spark the addresses of the resources allocated to each container. For that reason, the user must specify a discovery script that gets run by the executor on startup to discover what resources are available to that executor. You can find an example script in `examples/src/main/scripts/getGpusResources.sh`. The script must have execute permissions set and the user should set up permissions to not allow malicious users to modify it. The script should write to STDOUT a JSO [...]
 
+# Stage Level Scheduling Overview
+
+Stage level scheduling is supported on YARN when dynamic allocation is enabled. One YARN-specific detail is that each `ResourceProfile` requires a different container priority on YARN. The mapping is simple: the ResourceProfile id becomes the priority, and on YARN lower numbers are higher priority. This means that profiles created earlier will have a higher priority in YARN. Normally this won't matter, as Spark finishes one stage before starting another one; the only case this mig [...]
+
 # Important notes
 
 - Whether core requests are honored in scheduling decisions depends on which 
scheduler is in use and how it is configured.
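The id-to-priority mapping described in the stage level scheduling note above can be modeled in a few lines. This is a hypothetical illustration, not Spark or YARN code: it only shows that the ResourceProfile id is used directly as the container priority and that YARN satisfies lower priority numbers first.

```python
# Hypothetical model (not Spark/YARN code) of the mapping described above:
# each ResourceProfile's id becomes the YARN container priority, and YARN
# serves lower priority numbers first.

def allocation_order(pending_profile_ids):
    """Return profile ids in the order YARN would satisfy their container
    requests: priority == profile id, lower number served first."""
    return sorted(pending_profile_ids, key=lambda profile_id: profile_id)

# The default profile has id 0; profiles created later get larger ids,
# so earlier-created profiles win when requests are pending at once.
print(allocation_order([3, 0, 1]))  # -> [0, 1, 3]
```

As the doc notes, this ordering only matters in the rare case where container requests for several profiles are outstanding at the same time.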

