shlok-srivastava opened a new issue, #2536:
URL: https://github.com/apache/apisix-ingress-controller/issues/2536
### Current Behavior
Problem Summary
The APISIX ingress controller does not automatically update upstream
configurations when Kubernetes pods restart and get new IP addresses, causing
504 Gateway Timeout errors until manually fixed.
Actual Behavior
- Upstream configuration retains old/stale pod IP addresses
- Requests time out with 504 errors
- Manual update via APISIX Admin API required to fix
Evidence
Before pod restart:
Pod IP: 10.202.10.56:7013
APISIX Upstream: 10.202.10.56:7013 ✓
Result: 404 (working)
After pod restart:
New Pod IP: 10.202.7.238:7013
APISIX Upstream: 10.202.10.56:7013 ✗ (stale)
Result: 504 Gateway Timeout
Manual Fix (temporary):
curl -X PATCH "http://127.0.0.1:9180/apisix/admin/upstreams/808482d9" \
  -H "X-API-KEY: xxx" \
  -H "Content-Type: application/json" \
  -d '{"nodes":[{"host":"10.202.7.238","port":7013,"weight":100,"priority":0}]}'
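Until the controller syncs endpoints itself, the payload for this manual fix could be generated rather than typed by hand. The following is a minimal sketch, assuming the live addresses are passed in as `ip:port` pairs (for example, copied from the `kubectl get endpoints` output). `build_nodes_payload` is a hypothetical helper name, not part of APISIX or the ingress controller; the weight and priority values are the ones used in the manual fix above.

```shell
#!/usr/bin/env bash
# build_nodes_payload is a hypothetical helper (not an APISIX command).
# It turns "ip:port" arguments into the JSON body used by the manual fix.
build_nodes_payload() {
  local first=1 addr host port
  printf '{"nodes":['
  for addr in "$@"; do
    host=${addr%:*}     # everything before the last colon
    port=${addr##*:}    # everything after the last colon
    [ "$first" -eq 1 ] || printf ','
    printf '{"host":"%s","port":%s,"weight":100,"priority":0}' "$host" "$port"
    first=0
  done
  printf ']}\n'
}

# Possible usage with the upstream ID and address from this report:
#   curl -X PATCH "http://127.0.0.1:9180/apisix/admin/upstreams/808482d9" \
#     -H "X-API-KEY: xxx" \
#     -H "Content-Type: application/json" \
#     -d "$(build_nodes_payload 10.202.7.238:7013)"
```

This only automates the workaround; it does not address the missing endpoint sync in the controller.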
Impact
- Severity: High - affects production workloads
- Scope: All services behind APISIX when pods restart
- Workaround: Manual upstream updates via admin API
Controller Configuration
ingress-controller:
  enabled: true
  image:
    tag: 2.0.0-rc4 # Also tested with 2.0.0-rc3
  config:
    apisix:
      adminAPIVersion: v3
      serviceName: apisix-v3-admin
      serviceNamespace: ingress-apisix
Kubernetes Service/Endpoints Status
Kubernetes service discovery works correctly:
$ kubectl get endpoints carbon -n beta
NAME ENDPOINTS AGE
carbon 10.202.7.238:7013 5m
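The mismatch between the Endpoints object and the APISIX upstream can be checked mechanically. This is a rough sketch, assuming both sides have already been collected as space-separated `ip:port` lists (the upstream side, for instance, read from the Admin API). `stale_nodes` is a hypothetical helper name introduced only for illustration.

```shell
#!/usr/bin/env bash
# stale_nodes UPSTREAM_NODES ENDPOINTS
# Hypothetical helper: prints every upstream node that is absent from
# the Kubernetes endpoints. Both arguments are space-separated "ip:port" lists.
stale_nodes() {
  local upstream=$1 endpoints=$2 node
  for node in $upstream; do
    case " $endpoints " in
      *" $node "*) ;;                    # still backed by a live endpoint
      *) printf '%s\n' "$node" ;;        # stale: no matching endpoint
    esac
  done
}

# With the values from this report, the old pod address is flagged:
#   stale_nodes "10.202.10.56:7013" "10.202.7.238:7013"
```

An empty result would mean every upstream node still maps to a live endpoint.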
Additional Context
- Issue occurs consistently on every pod restart
- Both rc3 and rc4 versions affected
- Kubernetes RBAC permissions appear correct
- No endpoint-related errors in controller logs
- Controller logs show status updates but no endpoint sync activity
Related ApisixRoute Configuration
apiVersion: apisix.apache.org/v2
kind: ApisixRoute
spec:
  http:
    - backends:
        - serviceName: carbon
          servicePort: 80
      match:
        hosts:
          - api.beta.staging.livspace.com
        paths:
          - /carbon/*
### Expected Behavior
- Ingress controller should automatically detect endpoint changes
- Upstream configuration should be updated with new pod IP
- Routes should continue working without manual intervention
### Error Logs
APISIX Error Logs:
[lua] balancer.lua:384: run(): proxy request to 10.202.10.56:7013 while connecting to upstream
upstream timed out (110: Connection timed out) while connecting to upstream
### Steps to Reproduce
1. Deploy APISIX ingress controller 2.0.0-rc3 or 2.0.0-rc4
2. Create an ApisixRoute pointing to a service (e.g., carbon service)
3. Verify the route works correctly
4. Delete the target pod to force restart: kubectl delete pod <pod-name>
5. Wait for new pod to start with different IP address
6. Test the route - it will return 504 Gateway Timeout
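Steps 4 to 6 could be scripted roughly as follows. This is a sketch under stated assumptions, not a verified reproduction script: the function names are hypothetical, all arguments are placeholders, and the gateway URL shown in the usage comment (`http://127.0.0.1:9080`) is an assumption that does not appear in this report. The script does not wait for the replacement pod to become ready, so step 5 still needs a manual check of the pod listing.

```shell
#!/usr/bin/env bash
# route_status URL HOST
# Prints the HTTP status code the gateway returns for the route.
route_status() {
  curl -s -o /dev/null -w '%{http_code}' -H "Host: $2" "$1"
}

# reproduce_stale_upstream POD NAMESPACE URL HOST   (hypothetical helper)
# Returns 0 if the 504 described in this report is observed, 1 otherwise,
# and 2 if the pod could not be deleted.
reproduce_stale_upstream() {
  local pod=$1 ns=$2 url=$3 host=$4 code
  kubectl delete pod "$pod" -n "$ns" || return 2   # step 4: force a restart
  kubectl get pods -n "$ns" -o wide                # step 5: inspect the new pod IP
  code=$(route_status "$url" "$host")              # step 6: test the route
  printf 'HTTP %s\n' "$code"
  [ "$code" = "504" ]
}

# Possible usage (the gateway address is an assumption):
#   reproduce_stale_upstream <pod-name> beta \
#     "http://127.0.0.1:9080/carbon/" api.beta.staging.livspace.com
```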
### Environment
- APISIX Version: 3.13.0
- APISIX Ingress Controller Versions Tested: 2.0.0-rc3, 2.0.0-rc4 (both affected)
- APISIX Helm Chart: 2.11.5
- Kubernetes Version: GKE (Google Kubernetes Engine)
- Deployment Method: Helm with Flux GitOps