[ 
https://issues.apache.org/jira/browse/YUNIKORN-1998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko resolved YUNIKORN-1998.
------------------------------------
    Fix Version/s: 1.4.0
       Resolution: Fixed

> Stale AdmissionControllerConf was used in e2e test
> --------------------------------------------------
>
>                 Key: YUNIKORN-1998
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-1998
>             Project: Apache YuniKorn
>          Issue Type: Bug
>          Components: test - e2e
>            Reporter: Yu-Lin Chen
>            Assignee: Yu-Lin Chen
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 1.4.0
>
>
> In e2e test for "user_group_limit", we updated Yunikorn ConfigMap before 
> submit sleep pods.  
> [https://github.com/apache/yunikorn-k8shim/blob/master/test/e2e/user_group_limit/user_group_limit_test.go#L67-L70]
> However, the AdmissionControllerConf in Admission controller doesn't 
> immediately reflect the changes. The time gap leads following error in CI 
> flow:  
> [https://github.com/apache/yunikorn-k8shim/actions/runs/6282144385/job/17062987755?pr=677#step:5:2346]
> We need to find a way to ensure AdmissionControllerConf has been updated 
> before e2e test submits a new pods.
> This issue can be reproduced if we introduce 1 second delay when the 
> admission controller updates the AdmissionControllerConf. Here is the example 
> code.
> {code:java}
> func (h *configMapUpdateHandler) OnUpdate(_, newObj interface{}) {
>     cm := utils.Convert2ConfigMap(newObj)
>     
>     // sleep 1 to delay AdmissionControllerConf update
>     time.Sleep(1 * time.Second)
>     
>     if idx, ok := h.configMapIndex(cm); ok {
>         h.conf.configUpdated(idx, cm)
>     }
> } {code}
> [https://github.com/apache/yunikorn-k8shim/blob/master/pkg/admission/conf/am_conf.go#L237-L238]
> Below are the before/after AM logs when we added 1 sec delay:
> Success without delay 1 sec:
>  * (AM configMapUpdateHandler)             05:36:39.184Z.  
> AdmissionControllerConf trying to upgrade config
>  * (AM configMapUpdateHandler)             
> {color:#de350b}05:36:39.185Z{color} . AdmissionControllerConf config upgraded
>  * (AM Webhook)                                       05:36:39.218Z.  AM 
> receive AdmissionReview request
>  * (AM Webhook)                                       
> {color:#de350b}05:36:39.221Z{color}.  AM check Pods with new config,  E2E 
> Test Passed 
> Failed with delay 1 sec:
>  * (AM configMapUpdateHandler)            08:19:31.025Z.   
> AdmissionControllerConf trying to upgrade config
>  * (AM Webhook)                                      08:19:31.067Z.   AM 
> receive AdmissionReview request
>  * (AM Webhook)                                      08:19:31.069Z    
> {color:#ff0000}*AM check Pods with stale config*{color}, {color:#ff0000}*E2E 
> Test failed* {color}
>  * (AM configMapUpdateHandler)            08:19:32.026Z.  
> AdmissionControllerConf config upgraded
> In my kind cluster(v1.24.15), there is only 0.036 sec gap 
> ({color:#de350b}05:36:39.185Z~ {color}{color:#de350b}05:36:39.221Z{color}).  
> It's possible that the admission controller in CI flow runs those steps in 
> different order.
> The possible Solution:
> 1. In the e2e test, sleep for 1 seconds between updating the configmap and 
> submitting new sleep pods. (It's quick fix, I assumed the time gap is always 
> less than 1 sec ) 
> 2. In the e2e test, check current AdmissionControllerConf's value before 
> submit a new pod. (How do client dump current AdmissionControllerConf? Need 
> to seek for more advice.)
> Since this issue only impact e2e test for now, we can go with solution #1 as 
> a quick fix.  But it'll be better if we allow client to check whether 
> AdmissionControllerConf is up-to-date.
> Please kindly let me know if I have any misunderstandings. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@yunikorn.apache.org
For additional commands, e-mail: dev-h...@yunikorn.apache.org

Reply via email to