[I] bug: APISIX Ingress Controller fails to sync endpoint changes when pods restart (affects 2.0.0-rc3 and 2.0.0-rc4) [apisix-ingress-controller]

via GitHub Mon, 01 Sep 2025 01:55:41 -0700


shlok-srivastava opened a new issue, #2536:
URL: https://github.com/apache/apisix-ingress-controller/issues/2536


   ### Current Behavior
   
     Problem Summary
   
     The APISIX ingress controller does not update upstream configurations
     when Kubernetes pods restart and receive new IP addresses. Requests then
     fail with 504 Gateway Timeout errors until the upstream is corrected manually.
   
     Actual Behavior
   
     - The upstream configuration retains old (stale) pod IP addresses
     - Requests time out with 504 errors
     - A manual update via the APISIX Admin API is required to restore traffic
   
     Evidence
   
     Before pod restart:
     Pod IP: 10.202.10.56:7013
     APISIX Upstream: 10.202.10.56:7013 ✓
     Result: 404 (working)
   
     After pod restart:
     New Pod IP: 10.202.7.238:7013
     APISIX Upstream: 10.202.10.56:7013 ✗ (stale)
     Result: 504 Gateway Timeout
   
     Manual Fix (temporary):
     curl -X PATCH "http://127.0.0.1:9180/apisix/admin/upstreams/808482d9" \
       -H "X-API-KEY: xxx" \
       -H "Content-Type: application/json" \
       -d '{"nodes":[{"host":"10.202.7.238","port":7013,"weight":100,"priority":0}]}'
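
When a service has several pods, writing the request body for the manual fix by hand is error-prone. The following is a minimal sketch, not part of APISIX or the ingress controller, that builds the same `nodes` body from a list of pod IPs. The function name `build_nodes_patch` and its defaults are hypothetical; the field names and values are taken from the command above.

```python
import json


def build_nodes_patch(pod_ips, port, weight=100, priority=0):
    """Return a JSON body that replaces the `nodes` list of an upstream.

    The node fields (host, port, weight, priority) mirror the manual fix
    shown in this report; duplicate IPs are dropped and the rest sorted so
    the output is stable.
    """
    nodes = [
        {"host": ip, "port": port, "weight": weight, "priority": priority}
        for ip in sorted(set(pod_ips))
    ]
    # Compact separators keep the body on one line for use with `curl -d`.
    return json.dumps({"nodes": nodes}, separators=(",", ":"))


if __name__ == "__main__":
    print(build_nodes_patch(["10.202.7.238"], 7013))
```

For the single pod in this report the function reproduces the body passed to `-d` in the manual fix.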
   
     Impact
   
     - Severity: High - affects production workloads
     - Scope: All services behind APISIX when pods restart
     - Workaround: Manual upstream updates via admin API
   
     Controller Configuration
   
     ingress-controller:
       enabled: true
       image:
         tag: 2.0.0-rc4  # Also tested with 2.0.0-rc3
       config:
         apisix:
           adminAPIVersion: v3
           serviceName: apisix-v3-admin
           serviceNamespace: ingress-apisix
   
     Kubernetes Service/Endpoints Status
   
     Kubernetes service discovery works correctly:
     $ kubectl get endpoints carbon -n beta
     NAME     ENDPOINTS          AGE
     carbon   10.202.7.238:7013   5m
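
The mismatch between the Endpoints object and the APISIX upstream can be checked mechanically. Below is a minimal sketch under stated assumptions: the Endpoints JSON comes from `kubectl get endpoints carbon -n beta -o json`, and the node data comes from the upstream object returned by the Admin API. All function names are hypothetical helpers and not part of any APISIX tooling.

```python
import json


def endpoint_addresses(endpoints_json):
    """Extract "ip:port" strings from the JSON of a Kubernetes Endpoints object."""
    doc = json.loads(endpoints_json)
    found = set()
    # `subsets` is absent or null when the service has no ready pods.
    for subset in doc.get("subsets") or []:
        for addr in subset.get("addresses") or []:
            for port in subset.get("ports") or []:
                found.add(f'{addr["ip"]}:{port["port"]}')
    return found


def upstream_addresses(nodes):
    """Normalize upstream nodes to "host:port" strings.

    Accepts either a list of {"host", "port", ...} objects (as in the manual
    fix in this report) or a {"host:port": weight} mapping.
    """
    if isinstance(nodes, dict):
        return set(nodes)
    return {f'{n["host"]}:{n["port"]}' for n in nodes}


def diff_upstream(endpoints_json, nodes):
    """Report addresses that are stale in APISIX or missing from it."""
    live = endpoint_addresses(endpoints_json)
    configured = upstream_addresses(nodes)
    return {
        "stale": sorted(configured - live),
        "missing": sorted(live - configured),
    }
```

With the values from this report, `stale` would contain `10.202.10.56:7013` and `missing` would contain `10.202.7.238:7013`. Two empty lists mean the upstream matches the Endpoints object.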
   
     Additional Context
   
     - The issue occurs on every pod restart
     - Both rc3 and rc4 versions affected
     - Kubernetes RBAC permissions appear correct
     - No endpoint-related errors in controller logs
     - Controller logs show status updates but no endpoint sync activity
   
     Related ApisixRoute Configuration
   
     apiVersion: apisix.apache.org/v2
     kind: ApisixRoute
     spec:
       http:
       - backends:
         - serviceName: carbon
           servicePort: 80
         match:
           hosts:
           - api.beta.staging.livspace.com
           paths:
           - /carbon/*
   
   
   ### Expected Behavior
   
     - Ingress controller should automatically detect endpoint changes
     - Upstream configuration should be updated with new pod IP
     - Routes should continue working without manual intervention
   
   
   ### Error Logs
   
     APISIX Error Logs:
     [lua] balancer.lua:384: run(): proxy request to 10.202.10.56:7013 while connecting to upstream
     upstream timed out (110: Connection timed out) while connecting to upstream
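
To see which address APISIX is still targeting, the address can be pulled out of the error log. This is a small sketch that assumes the log lines have the `proxy request to <ip>:<port>` form shown above. The helper name `proxy_targets` is hypothetical, and the pattern only covers IPv4 addresses.

```python
import re

# Matches the "proxy request to <ipv4>:<port>" fragment seen in the log above.
_PROXY_TARGET = re.compile(r"proxy request to (\d{1,3}(?:\.\d{1,3}){3}):(\d+)")


def proxy_targets(log_text):
    """Return the distinct (ip, port) pairs APISIX tried to reach, in first-seen order."""
    seen = []
    for ip, port in _PROXY_TARGET.findall(log_text):
        pair = (ip, int(port))
        if pair not in seen:
            seen.append(pair)
    return seen
```

If an address returned here no longer appears in `kubectl get endpoints`, the upstream is stale as described in this report.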
   
   
   ### Steps to Reproduce
   
     1. Deploy APISIX ingress controller 2.0.0-rc3 or 2.0.0-rc4
     2. Create an ApisixRoute pointing to a service (e.g., carbon service)
     3. Verify the route works correctly
     4. Delete the target pod to force restart: kubectl delete pod <pod-name>
     5. Wait for new pod to start with different IP address
     6. Test the route - it will return 504 Gateway Timeout
   
   
   ### Environment
   
     - APISIX Version: 3.13.0
     - APISIX Ingress Controller Versions Tested: 2.0.0-rc3, 2.0.0-rc4 (both affected)
     - APISIX Helm Chart: 2.11.5
     - Kubernetes Version: GKE (Google Kubernetes Engine)
     - Deployment Method: Helm with Flux GitOps
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

