[ https://issues.apache.org/jira/browse/YUNIKORN-2665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wilfred Spiegelenburg resolved YUNIKORN-2665.
---------------------------------------------
    Fix Version/s: 1.6.0
                   1.5.2
       Resolution: Fixed

Changes have been committed and backported into the 1.5 branch. Closing.

> Gang app originator pod changes after restart
> ---------------------------------------------
>
>                 Key: YUNIKORN-2665
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-2665
>             Project: Apache YuniKorn
>          Issue Type: Bug
>          Components: shim - kubernetes
>    Affects Versions: 1.3.0, 1.4.0, 1.5.0, 1.5.1
>            Reporter: Manikandan R
>            Assignee: Manikandan R
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: 1.6.0, 1.5.2
>
>
> A gang app chooses the first pod (the one that created the app) as the
> originator pod, which later becomes the real driver pod. While processing a
> gang app, specifically after placeholder creation and during replacement, a
> restart can lead to the incorrect behaviour described below:
> During restore, there is no guarantee on the ordering of pods coming from
> the K8s lister, especially when all the pods were created with the same
> timestamp. K8s uses second-based timestamps, which means all pods created
> within the same second have the same timestamp. In this situation, YK
> designates whichever pod comes first from the lister as the originator pod.
> So any placeholder could become the originator pod, and the actual
> originator pod is lost. This change could cause ripple effects leading to
> unexpected behaviour and needs to be fixed.