[ https://issues.apache.org/jira/browse/SPARK-32067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17206082#comment-17206082 ]
Stijn De Haes commented on SPARK-32067:
---------------------------------------

Also ran into this today. This actually makes the pod template feature useless for me.
I'll try to make a PR next week to do the proposed solution :)

> [K8S] Executor pod template ConfigMap of ongoing submission got inadvertently
> altered by subsequent submission
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-32067
>                 URL: https://issues.apache.org/jira/browse/SPARK-32067
>             Project: Spark
>          Issue Type: Bug
>          Components: Kubernetes
>    Affects Versions: 2.4.7, 3.0.1
>            Reporter: James Yu
>            Priority: Minor
>
> THE BUG:
> The bug can be reproduced by using spark-submit to submit two different apps
> (app1 and app2) with different executor pod templates (e.g., different labels)
> to K8s sequentially, with app2 launching while app1 is still in the middle of
> ramping up all its executor pods. The unwanted result is that some launched
> executor pods of app1 end up having app2's executor pod template applied to
> them.
> The root cause appears to be that app1's podspec-configmap got overwritten by
> app2 during the overlapping launching periods, because both apps use the same
> ConfigMap (name). As a result, some of app1's executor pods that ramp up after
> app2 is launched are inadvertently launched with app2's pod template.
> The issue can be seen as follows:
> First, after submitting app1, you get these configmaps:
> {code:java}
> NAMESPACE   NAME                                    DATA   AGE
> default     app1-1111111111111111-driver-conf-map   1      9m46s
> default     podspec-configmap                       1      12m{code}
> Then submit app2 while app1 is still ramping up its executors. The
> podspec-configmap is modified by app2.
> {code:java}
> NAMESPACE   NAME                                    DATA   AGE
> default     app1-1111111111111111-driver-conf-map   1      11m43s
> default     app2-2222222222222222-driver-conf-map   1      10s
> default     podspec-configmap                       1      13m57s{code}
>
> PROPOSED SOLUTION:
> Properly prefix the podspec-configmap for each submitted app, ideally the
> same way as the driver configmap:
> {code:java}
> NAMESPACE   NAME                                      DATA   AGE
> default     app1-1111111111111111-driver-conf-map     1      11m43s
> default     app1-1111111111111111-podspec-configmap   1      13m57s
> default     app2-2222222222222222-driver-conf-map     1      10s
> default     app2-2222222222222222-podspec-configmap   1      3m{code}
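
The collision described in the issue, and the effect of the proposed prefix, can be illustrated with a small stand-alone sketch. This is not Spark's code: the namespace is modelled as a plain dict from ConfigMap name to pod template, and the names submit, SHARED_NAME and the per_app_name flag are hypothetical, chosen only for this illustration. The app prefixes are the ones shown in the listings above.

```python
# Hypothetical model, not Spark's implementation: a Kubernetes namespace is
# represented as a dict mapping ConfigMap name -> executor pod template.
SHARED_NAME = "podspec-configmap"


def submit(namespace, app_prefix, executor_template, per_app_name):
    """Create or replace the executor pod template ConfigMap for one submission.

    With per_app_name=False every app writes to the same ConfigMap name
    (the behaviour reported in the issue). With per_app_name=True the name
    is prefixed per app (the proposed solution).
    """
    name = f"{app_prefix}-{SHARED_NAME}" if per_app_name else SHARED_NAME
    namespace[name] = executor_template
    return name


# Reported behaviour: both apps share one name, so app2 overwrites app1's template.
shared = {}
app1_cm = submit(shared, "app1-1111111111111111", {"label": "app1"}, per_app_name=False)
submit(shared, "app2-2222222222222222", {"label": "app2"}, per_app_name=False)
late_app1_template_shared = shared[app1_cm]  # what a late app1 executor would read

# Proposed behaviour: each app gets its own ConfigMap, so nothing is overwritten.
prefixed = {}
app1_cm_fixed = submit(prefixed, "app1-1111111111111111", {"label": "app1"}, per_app_name=True)
submit(prefixed, "app2-2222222222222222", {"label": "app2"}, per_app_name=True)
late_app1_template_prefixed = prefixed[app1_cm_fixed]

print(late_app1_template_shared)
print(late_app1_template_prefixed)
print(sorted(prefixed))
```

In the shared-name case an app1 executor created after app2's submission reads app2's template. With the per-app prefix it still reads app1's own template, and the namespace holds one podspec ConfigMap per app, matching the listing in the proposed solution.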