[ 
https://issues.apache.org/jira/browse/SPARK-34389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17281534#comment-17281534
 ] 

Ranju commented on SPARK-34389:
-------------------------------

Yes, I now understand why there is no retry logic. Thanks for the 
explanation; the issue can be closed.

Can you advise on how to mitigate the indefinite wait for executors? Is it 
possible to get the cluster's available resources, compare them with the 
resources the executors require, and submit the job only if they are sufficient?
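
As a rough illustration of the check I have in mind, here is a minimal Python 
sketch. It assumes the cluster's free memory and CPU have already been obtained 
somehow (that part is not shown), and that each executor pod requests its heap 
plus a memory overhead of 10% with a 384 MiB floor. The function names and the 
free_cores value are hypothetical. Comparing cluster-wide totals ignores how 
the free resources are spread across nodes, so a passing check is only a 
necessary condition, not a guarantee that the pods will be scheduled.

```python
import math

MIN_OVERHEAD_MIB = 384  # assumed floor for per-executor memory overhead


def executor_pod_memory_mib(executor_memory_mib, overhead_factor=0.10):
    """Memory one executor pod requests: heap plus an assumed overhead."""
    overhead = max(math.ceil(executor_memory_mib * overhead_factor),
                   MIN_OVERHEAD_MIB)
    return executor_memory_mib + overhead


def can_schedule_min_executors(free_memory_mib, free_cores,
                               executor_memory_mib, executor_cores,
                               min_executors, overhead_factor=0.10):
    """Return True only if aggregate free resources cover min_executors."""
    needed_memory = min_executors * executor_pod_memory_mib(
        executor_memory_mib, overhead_factor)
    needed_cores = min_executors * executor_cores
    return free_memory_mib >= needed_memory and free_cores >= needed_cores


if __name__ == "__main__":
    # Numbers from the issue description: 4Gi free, 10G executors,
    # minExecutors=4. free_cores and executor_cores are made-up values.
    ok = can_schedule_min_executors(
        free_memory_mib=4 * 1024, free_cores=16,
        executor_memory_mib=10 * 1024, executor_cores=1,
        min_executors=4)
    print("submit" if ok else "do not submit")
```

With the numbers from the description the check fails, because even one 
10G executor needs more than the 4Gi that is free.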

> Spark job on Kubernetes scheduled For Zero or less than minimum number of 
> executors and Wait indefinitely under resource starvation
> -----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-34389
>                 URL: https://issues.apache.org/jira/browse/SPARK-34389
>             Project: Spark
>          Issue Type: Bug
>          Components: Kubernetes
>    Affects Versions: 3.0.1
>            Reporter: Ranju
>            Priority: Major
>         Attachments: DriverLogs_ExecutorLaunchedLessThanMinExecutor.txt, 
> Steps to reproduce.docx
>
>
> If the cluster does not have sufficient resources (CPU/memory) for the 
> minimum number of executors, the executors stay in the Pending state 
> indefinitely until resources are freed.
> Suppose the cluster configuration is:
> total memory=204Gi
> used memory=200Gi
> free memory=4Gi
> spark.executor.memory=10G
> spark.dynamicAllocation.minExecutors=4
> spark.dynamicAllocation.maxExecutors=8
> Instead, the job should be cancelled if the requested minimum number of 
> executors cannot be scheduled at that point in time because of insufficient 
> resources. Currently Spark does partial scheduling or no scheduling and 
> waits indefinitely, so the job gets stuck.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
