[Spark][Core] Resource Allocation

2022-07-12 Thread Amin Borjian
I have a couple of questions, and I am trying to find out whether there is no 
solution for them (due to the current implementation) or whether there is a way 
that I was simply not aware of.

1)

Currently, we can enable and configure dynamic resource allocation based on 
the documentation below:
https://spark.apache.org/docs/latest/job-scheduling.html#dynamic-resource-allocation

Based on the documentation, it is possible to start with an initial number of 
executors and, if there are pending tasks, request more executors. Also, if 
some executors become idle and there are no more tasks, those executors are 
killed (so the resources can be used by others). My question concerns the case 
where we have 2 SparkContexts (separate applications). In such a case, I expect 
dynamic allocation to work as fairly as possible and distribute resources 
equally. But what I observe is that if SparkContext 1 uses all of the executors 
because it has running tasks, it does not release them until it has no more 
tasks to run and the executors become idle. Spark could avoid scheduling new 
tasks from SparkContext 1 (since it is not reasonable to kill running tasks) 
and instead free executors for SparkContext 2, but it does not do so, and I 
have not found any configuration for this. Have I understood correctly? And is 
there really no way to achieve fair dynamic allocation between contexts?
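
For reference, a minimal sketch of the kind of settings I mean, in 
spark-defaults.conf style (the values are only illustrative):

  spark.dynamicAllocation.enabled                   true
  spark.dynamicAllocation.shuffleTracking.enabled   true
  spark.dynamicAllocation.initialExecutors          2
  spark.dynamicAllocation.minExecutors              0
  spark.dynamicAllocation.maxExecutors              20
  spark.dynamicAllocation.executorIdleTimeout       60s

With two applications configured like this, executors are only returned when 
they become idle, which is the behavior I described above.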

2)

In dynamic or even static resource allocation, Spark has to run a number of 
executors on the resources available in the cluster (workers). The data stored 
on the cluster has little skew and is distributed across the whole cluster. 
For this reason, it would be better for executors to be spread out over the 
cluster as much as possible in order to benefit from data locality. But what I 
observe is that Spark sometimes runs 2 or more executors on the same worker 
even when there are idle workers. Is this intentional (for reasons I am 
missing), or would spreading executors out be better but simply not supported 
by Spark at the moment?
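
For instance, assuming a standalone deployment, the closest settings I am aware 
of are along these lines (only a sketch; spark.deploy.spreadOut is a 
master-side setting that spreads an application's executors across nodes, and I 
did not find a strict "one executor per worker" option):

  spark.deploy.spreadOut   true
  spark.executor.cores     4
  spark.cores.max          16

Even with these, more than one executor can still land on the same worker.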


Re: [Spark][Core] Resource Allocation

2022-07-15 Thread Sungwoo Park
For 1), this is a recurring question in this mailing list, and the answer
is: no, Spark does not support the coordination between multiple Spark
applications. Spark relies on an external resource manager, such as Yarn
or Kubernetes, to allocate resources to multiple Spark applications. For
example, to achieve a fair allocation of resources on Yarn, one should
configure the Yarn Fair Scheduler.
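
As a rough sketch (queue names and weights are only illustrative), a
fair-scheduler.xml for Yarn could look like:

  <allocations>
    <queue name="spark1">
      <weight>1.0</weight>
      <schedulingPolicy>fair</schedulingPolicy>
    </queue>
    <queue name="spark2">
      <weight>1.0</weight>
      <schedulingPolicy>fair</schedulingPolicy>
    </queue>
  </allocations>

Each Spark application is then submitted to its own queue (for example via
spark.yarn.queue), and Yarn, not Spark, arbitrates resources between them.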

Databricks seems to have their own solution to this problem (with the
multi-cluster optimization option). For Apache Spark, there is an extension
called Spark-MR3 which can manage resources among multiple Spark
applications. If you are interested, see the blog article:
https://www.datamonad.com/post/2021-08-18-spark-mr3/
From the blog:

*The main motivation for developing Spark on MR3 is to allow multiple Spark
applications to share compute resources such as Yarn containers or
Kubernetes Pods.*

We have released Spark 3.0.3 on MR3, and Spark 3.2.1 on MR3 will be
released sometime soon.
If you are further interested, see the webpage of Spark on MR3:
https://mr3docs.datamonad.com/docs/spark/

--- Sungwoo
