[ https://issues.apache.org/jira/browse/SPARK-32067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208372#comment-17208372 ]
Dongjoon Hyun commented on SPARK-32067:
---------------------------------------

[~james...@ymail.com]. It's used when the `master` branch is affected, to distinguish the following cases.
- `Affected Version = the version of master` means the bug exists in the `master` branch, or the new feature is being added in the `master` branch for the next release.
- `Affected Version = 3.0.2 and not 3.1.0` means the bug exists only in `branch-3.0`. It does not exist in the master branch because of another improvement or fix.

> Use unique ConfigMap name for executor pod template
> ---------------------------------------------------
>
>                 Key: SPARK-32067
>                 URL: https://issues.apache.org/jira/browse/SPARK-32067
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Kubernetes
>    Affects Versions: 2.4.7, 3.0.1, 3.1.0
>            Reporter: James Yu
>            Priority: Major
>
> THE BUG:
> The bug is reproducible by using spark-submit to submit two different apps
> (app1 and app2) with different executor pod templates (e.g., different
> labels) to K8s sequentially, with app2 launching while app1 is still in the
> middle of ramping up all its executor pods. The unwanted result is that some
> launched executor pods of app1 end up having app2's executor pod template
> applied to them.
> The root cause appears to be that app1's podspec-configmap got overwritten by
> app2 during the overlapping launch periods because both apps use the same
> ConfigMap name. As a result, some of app1's executor pods that ramp up after
> app2 is launched are inadvertently launched with app2's pod template.
> The issue can be seen as follows:
> First, after submitting app1, you get these configmaps:
> {code:java}
> NAMESPACE   NAME                                    DATA   AGE
> default     app1-1111111111111111-driver-conf-map   1      9m46s
> default     podspec-configmap                       1      12m{code}
> Then submit app2 while app1 is still ramping up its executors. The
> podspec-configmap is modified by app2.
> {code:java}
> NAMESPACE   NAME                                    DATA   AGE
> default     app1-1111111111111111-driver-conf-map   1      11m43s
> default     app2-2222222222222222-driver-conf-map   1      10s
> default     podspec-configmap                       1      13m57s{code}
>
> PROPOSED SOLUTION:
> Properly prefix the podspec-configmap for each submitted app, ideally the
> same way as the driver configmap:
> {code:java}
> NAMESPACE   NAME                                      DATA   AGE
> default     app1-1111111111111111-driver-conf-map     1      11m43s
> default     app1-1111111111111111-podspec-configmap   1      13m57s
> default     app2-2222222222222222-driver-conf-map     1      10s
> default     app2-2222222222222222-podspec-configmap   1      3m{code}
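
The overwrite described in the quoted issue, and the effect of the proposed per-app prefix, can be illustrated with a small toy model. This is only a sketch and not Spark's actual code: the namespace is modelled as a plain dictionary keyed by ConfigMap name, and the helper names (`shared_configmap_name`, `prefixed_configmap_name`, `submit`) are hypothetical, introduced here for illustration only.

```python
# Toy model of a Kubernetes namespace: ConfigMap name -> pod template contents.
# All names below are hypothetical; this is not Spark's implementation.

POD_TEMPLATE_CONFIGMAP_SUFFIX = "podspec-configmap"


def shared_configmap_name(resource_prefix: str) -> str:
    """Behaviour described in the report: every app uses the same name."""
    return POD_TEMPLATE_CONFIGMAP_SUFFIX


def prefixed_configmap_name(resource_prefix: str) -> str:
    """Proposed behaviour: prefix the name like the driver ConfigMap."""
    return f"{resource_prefix}-{POD_TEMPLATE_CONFIGMAP_SUFFIX}"


def submit(namespace: dict, resource_prefix: str, template: str, name_fn) -> None:
    """Stand-in for a submission creating the executor pod template ConfigMap."""
    namespace[name_fn(resource_prefix)] = template


# Shared name: app2's submission replaces app1's template.
shared_ns = {}
submit(shared_ns, "app1-1111111111111111", "labels: {app: app1}", shared_configmap_name)
submit(shared_ns, "app2-2222222222222222", "labels: {app: app2}", shared_configmap_name)

# Prefixed names: each app keeps its own template.
prefixed_ns = {}
submit(prefixed_ns, "app1-1111111111111111", "labels: {app: app1}", prefixed_configmap_name)
submit(prefixed_ns, "app2-2222222222222222", "labels: {app: app2}", prefixed_configmap_name)

print(sorted(shared_ns))
print(sorted(prefixed_ns))
```

With the shared name, only one entry survives and it holds app2's template, which mirrors how app1's later executors could pick up the wrong template. With the prefixed names, both entries coexist.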