[ 
https://issues.apache.org/jira/browse/SPARK-32067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Yu updated SPARK-32067:
-----------------------------
    Summary: [K8S] Executor pod template config map of ongoing submission got 
inadvertently altered by subsequent submission  (was: [K8S] Executor pod 
template of ongoing submission got inadvertently altered by subsequent 
submission)

> [K8S] Executor pod template config map of ongoing submission got 
> inadvertently altered by subsequent submission
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-32067
>                 URL: https://issues.apache.org/jira/browse/SPARK-32067
>             Project: Spark
>          Issue Type: Bug
>          Components: Kubernetes
>    Affects Versions: 2.4.6, 3.0.0
>            Reporter: James Yu
>            Priority: Minor
>
> THE BUG:
> The bug can be reproduced by using spark-submit to submit two different apps 
> (app1 and app2) with different executor pod templates (e.g., different 
> labels) to K8s sequentially, with app2 launching while app1 is still ramping 
> up its executor pods. The unwanted result is that some of app1's executor 
> pods are launched with app2's executor pod template applied to them.
> The root cause appears to be that app1's podspec-configmap is overwritten by 
> app2 during the overlapping launch periods, because both apps use the same 
> configmap name. As a result, app1 executor pods that are ramped up after 
> app2 is launched are inadvertently created with app2's pod template. The 
> issue can be seen as follows:
> First, after submitting app1, you get these configmaps:
> {code:java}
> NAMESPACE    NAME                                       DATA    AGE
> default      app1-1111111111111111-driver-conf-map      1       9m46s
> default      podspec-configmap                          1       12m{code}
> Then submit app2 while app1 is still ramping up its executors. The 
> podspec-configmap is modified by app2; no separate one is created for it.
> {code:java}
> NAMESPACE    NAME                                       DATA    AGE
> default      app1-1111111111111111-driver-conf-map      1       11m43s
> default      app2-2222222222222222-driver-conf-map      1       10s
> default      podspec-configmap                          1       13m57s{code}
>
> PROPOSED SOLUTION:
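> Prefix the podspec-configmap name with the same per-app prefix that the 
> driver-conf-map already uses, so that each submitted app gets its own 
> podspec-configmap.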
> {code:java}
> NAMESPACE    NAME                                       DATA    AGE
> default      app1-1111111111111111-driver-conf-map      1       11m43s
> default      app1-1111111111111111-podspec-configmap    1       13m57s
> default      app2-2222222222222222-driver-conf-map      1       10s 
> default      app2-2222222222222222-podspec-configmap    1       3m{code}
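
A minimal sketch of the naming change proposed above, assuming the per-app prefix (e.g. app1-1111111111111111) is unique for each submission. The class PodSpecConfigMapNames and its method prefixed are hypothetical names used only for illustration; they are not taken from the Spark code base. The maps are plain stand-ins for the config maps in a namespace, not calls to a Kubernetes client.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; this is not Spark's actual code.
final class PodSpecConfigMapNames {
    // Fixed name that both submissions share in the report above.
    static final String SHARED_NAME = "podspec-configmap";

    // Builds a per-app name by prepending the app's prefix (assumed unique).
    static String prefixed(String appPrefix) {
        return appPrefix + "-" + SHARED_NAME;
    }

    public static void main(String[] args) {
        // Stand-in for the config maps in one namespace: name -> pod template.
        Map<String, String> shared = new LinkedHashMap<>();
        shared.put(SHARED_NAME, "app1 template");
        shared.put(SHARED_NAME, "app2 template"); // same key: app1's entry is replaced
        System.out.println("shared name:   " + shared);

        Map<String, String> perApp = new LinkedHashMap<>();
        perApp.put(prefixed("app1-1111111111111111"), "app1 template");
        perApp.put(prefixed("app2-2222222222222222"), "app2 template"); // distinct key
        System.out.println("prefixed name: " + perApp);
    }
}
```

With the shared name, the first map ends up holding only app2's template, which mirrors the overwrite described in the report. With prefixed names, both templates are kept side by side.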


