[ 
https://issues.apache.org/jira/browse/SPARK-21082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16050016#comment-16050016
 ] 

Saisai Shao commented on SPARK-21082:
-------------------------------------

It's fine if the storage memory is not enough to cache all the data; Spark 
can still handle this scenario without OOM. Scheduling tasks based on free 
memory is too scenario-specific, from my understanding.

[~tgraves] [~irashid] [~mridulm80] may have more thoughts on it. 

> Consider Executor's memory usage when scheduling task 
> ------------------------------------------------------
>
>                 Key: SPARK-21082
>                 URL: https://issues.apache.org/jira/browse/SPARK-21082
>             Project: Spark
>          Issue Type: Improvement
>          Components: Scheduler, Spark Core
>    Affects Versions: 2.3.0
>            Reporter: DjvuLee
>
> The Spark scheduler does not consider memory usage when dispatching tasks. 
> This can sometimes lead to Executor OOM if the RDD is cached: because Spark 
> cannot estimate memory usage well enough (especially when the RDD type is 
> not flat), the scheduler may dispatch too many tasks to one Executor.
> We could offer a configuration option that lets the user decide whether the 
> scheduler considers memory usage.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
