[jira] [Updated] (SPARK-42260) Log when the K8s Exec Pods Allocator Stalls

2023-08-09 Thread Holden Karau (Jira)


 [ https://issues.apache.org/jira/browse/SPARK-42260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Holden Karau updated SPARK-42260:
-
Target Version/s: 4.0.0

> Log when the K8s Exec Pods Allocator Stalls
> ---
>
> Key: SPARK-42260
> URL: https://issues.apache.org/jira/browse/SPARK-42260
> Project: Spark
> Issue Type: Improvement
> Components: Kubernetes
> Affects Versions: 3.4.0, 3.4.1
> Reporter: Holden Karau
> Assignee: Holden Karau
> Priority: Minor
>
> Sometimes, when the K8s APIs are slow, the ExecutorPods allocator can stall.
> It would be good to log this (and how long we have been stalled) so that
> folks can tell more clearly why Spark is unable to reach the desired target
> number of executors.
>
> This is _somewhat_ related to SPARK-36664, which logs the time spent waiting
> for executor allocation, but goes a step further for K8s and logs when we
> have stalled because there are too many pending pods.
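
A minimal sketch of the logging idea described in the issue, assuming a stall means "more executors are wanted, but no new pods can be requested because too many are already pending". This is not Spark's implementation: `StallTracker`, `observe`, `max_pending`, and the log messages are hypothetical names chosen for illustration only.

```python
import logging
import time
from typing import Callable, Optional

log = logging.getLogger("allocator_stall_sketch")


class StallTracker:
    """Tracks how long allocation has been blocked by too many pending pods.

    Hypothetical helper; `max_pending` stands in for whatever cap the
    allocator applies to outstanding pod requests.
    """

    def __init__(self, max_pending: int,
                 clock: Callable[[], float] = time.monotonic):
        self.max_pending = max_pending
        self.clock = clock
        self.stalled_since: Optional[float] = None

    def observe(self, pending: int, running: int, target: int) -> Optional[float]:
        """Record one allocation round.

        Returns the current stall duration in seconds, or None if the
        allocator is not stalled.
        """
        # Stalled: we still want more executors, but the pending cap is hit.
        stalled = (running + pending < target) and (pending >= self.max_pending)
        now = self.clock()

        if not stalled:
            if self.stalled_since is not None:
                log.info("Allocator resumed after %.1fs stalled",
                         now - self.stalled_since)
                self.stalled_since = None
            return None

        if self.stalled_since is None:
            # First round in which the stall is observed.
            self.stalled_since = now
        duration = now - self.stalled_since
        log.warning(
            "Allocator stalled for %.1fs: %d pending pods (cap %d), "
            "%d of %d executors running",
            duration, pending, self.max_pending, running, target,
        )
        return duration
```

Calling `observe` once per allocation round would produce one warning per round while the stall lasts, each carrying the elapsed stall time, and one info line when allocation resumes. The injectable `clock` is only there to make the sketch easy to exercise without real waiting.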



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-42260) Log when the K8s Exec Pods Allocator Stalls

2023-06-27 Thread Yuming Wang (Jira)


 [ https://issues.apache.org/jira/browse/SPARK-42260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuming Wang updated SPARK-42260:

Target Version/s:   (was: 3.4.1)



