[ 
https://issues.apache.org/jira/browse/SPARK-22683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289689#comment-16289689
 ] 

Xuefu Zhang commented on SPARK-22683:
-------------------------------------

[~tgraves], I can speak to our use case, where the same queries run on MR vs 
Spark via Hive. Because Spark gets rid of the intermediate HDFS reads/writes of 
MR, we expected better efficiency in addition to perf gains. While our 
expectation is met for some of our queries, usually long-running ones with many 
stages, the resource usage is much worse for other queries, especially 
short-running ones. 

I believe that efficiency can be substantially enhanced in both cases.

> DynamicAllocation wastes resources by allocating containers that will barely 
> be used
> ------------------------------------------------------------------------------------
>
>                 Key: SPARK-22683
>                 URL: https://issues.apache.org/jira/browse/SPARK-22683
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 2.1.0, 2.2.0
>            Reporter: Julien Cuquemelle
>              Labels: pull-request-available
>
> Let's say an executor has spark.executor.cores / spark.task.cpus taskSlots.
> The current dynamic allocation policy allocates enough executors
> for each taskSlot to execute a single task, which minimizes latency 
> but wastes resources when tasks are small relative to the executor allocation
> and idling overhead. 
> Adding tasksPerExecutorSlot makes it possible to specify how many 
> tasks a single slot should ideally execute, to mitigate the overhead of
> executor allocation.
> PR: https://github.com/apache/spark/pull/19881
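
To make the arithmetic in the description concrete, here is a minimal sketch in Python. It is not the code from the PR: the function names and the exact rounding are assumptions made for illustration, and only spark.executor.cores, spark.task.cpus and tasksPerExecutorSlot come from the issue text.

```python
def task_slots_per_executor(executor_cores: int, task_cpus: int) -> int:
    """Number of tasks one executor can run at once
    (spark.executor.cores / spark.task.cpus in the description)."""
    return executor_cores // task_cpus


def target_executors(pending_tasks: int,
                     executor_cores: int,
                     task_cpus: int,
                     tasks_per_executor_slot: int = 1) -> int:
    """Hypothetical helper: executors to request for a number of pending tasks.

    With tasks_per_executor_slot == 1 every slot runs a single task, which is
    the current policy as described in the issue. A larger value asks each
    slot to run several tasks in sequence, so fewer executors are requested.
    """
    slots = task_slots_per_executor(executor_cores, task_cpus)
    if slots < 1:
        raise ValueError("executor_cores must be at least task_cpus")
    if tasks_per_executor_slot < 1:
        raise ValueError("tasks_per_executor_slot must be at least 1")
    tasks_per_executor = slots * tasks_per_executor_slot
    # Integer ceiling division: a partially filled executor still counts.
    return (pending_tasks + tasks_per_executor - 1) // tasks_per_executor


if __name__ == "__main__":
    # 1000 pending tasks, 4 cores per executor, 1 CPU per task.
    print(target_executors(1000, 4, 1))      # one task per slot
    print(target_executors(1000, 4, 1, 6))   # six tasks per slot
```

Under these assumptions, 1000 small tasks on 4-slot executors would request 250 executors with one task per slot, and 42 executors with six tasks per slot, trading some latency for lower allocation and idling overhead.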


