Manikandan R created YUNIKORN-2665:
--------------------------------------

             Summary: Gang app originator pod changes after restart
                 Key: YUNIKORN-2665
                 URL: https://issues.apache.org/jira/browse/YUNIKORN-2665
             Project: Apache YuniKorn
          Issue Type: Bug
          Components: shim - kubernetes
    Affects Versions: 1.5.0, 1.4.0, 1.3.0, 1.5.1, 1.5.2
            Reporter: Manikandan R
            Assignee: Manikandan R


A gang app chooses the first pod (the one that created the app) as the 
originator pod, which later becomes the real driver pod. If a restart happens 
while a gang app is being processed, specifically after placeholder creation 
and during placeholder replacement, it can lead to the incorrect behaviour 
described below:

During restore, the order of pods returned by the K8s lister is not guaranteed, 
especially when all the pods were created within the same second. K8s uses 
second-granularity timestamps, which means all pods created within the same 
second have the same timestamp. In this situation, YK designates whichever pod 
comes first from the lister as the originator pod. So any placeholder could 
become the originator pod, and the actual originator pod is lost. This change 
could have ripple effects and needs to be fixed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@yunikorn.apache.org
For additional commands, e-mail: dev-h...@yunikorn.apache.org
