jotasixto commented on issue #2708:
URL: 
https://github.com/apache/apisix-ingress-controller/issues/2708#issuecomment-4218338000

   We hit a very similar issue in our environment and want to share our 
findings. In our case the root cause was **EKS security group rules blocking 
low-port traffic between worker nodes**, not a bug in APISIX or the ingress 
controller itself.
   
   ## Our Environment
   
   - **EKS cluster** deployed via the official [terraform-aws-eks 
module](https://github.com/terraform-aws-modules/terraform-aws-eks)
   - **APISIX standalone** deployed with Helm chart **v2.13.0**
   - Backend services exposing **port 80** on their pods (front-end Nginx 
containers with `containerPort: 80`)
   
   ## Observed Behavior
   
   We saw the same symptoms described in this issue: after pod rescheduling or 
scaling events, APISIX gateways would intermittently fail to reach backend pods 
with `(111: Connection refused)` errors, particularly when pods were placed on 
different nodes than the APISIX gateway pods.
   
   When we attempted a workaround of ensuring at least one replica of each 
backend service ran on every node, we discovered the actual underlying problem: 
**traffic on port 80 was being blocked between worker nodes by the EKS node 
security group rules**.
   
   ## Root Cause: EKS Security Group Default Rules and Low Ports
   
   The [official terraform-aws-eks documentation on network 
connectivity](https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/docs/network_connectivity.md)
 explains that the default node security group rules only allow traffic on 
**ephemeral ports (1025-65535)** between the cluster control plane and worker 
nodes, and between nodes themselves. This is by design — AWS considers it a 
best practice because **non-privileged pods should not bind to ports below 
1024**.
   
   Looking at the [security group 
diagram](https://raw.githubusercontent.com/terraform-aws-modules/terraform-aws-eks/master/.github/images/security_groups.svg)
 from the module documentation, port 80 is simply not in the allowed range for 
node-to-node or cluster-to-node ingress traffic.
   
   This means:
   - When an APISIX gateway pod on **Node A** tries to reach a backend pod on 
**Node B** using port 80, the traffic is **silently dropped** by the node 
security group.
   - When both APISIX and the backend pod happen to be on the **same node**, 
traffic works fine (it stays within the node and doesn't cross the security 
group boundary).
   - This creates the **intermittent** behavior: it works or fails depending on 
pod placement, which changes with scaling events, node rotation, etc.
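   
   One way to confirm this placement dependence is a throwaway pod pinned to a 
specific node. The manifest below is only a sketch: the pod name, node name and 
backend pod IP are hypothetical placeholders, and it assumes the public 
`curlimages/curl` image can be pulled in your cluster.
   
   ```yaml
   # Hypothetical debug pod; every name, node and IP below is a placeholder.
   apiVersion: v1
   kind: Pod
   metadata:
     name: cross-node-probe
   spec:
     # Pin the pod to a chosen node (this bypasses the scheduler).
     nodeName: ip-10-0-1-23.eu-west-1.compute.internal
     restartPolicy: Never
     containers:
       - name: probe
         image: curlimages/curl
         command: ["curl"]
         args:
           - "-sS"
           - "--connect-timeout"
           - "5"
           - "-o"
           - "/dev/null"
           - "-w"
           - "%{http_code}\n"
           # Replace with the IP of a backend pod running on another node.
           - "http://10.0.2.45:80/"
   ```
   
   Running it once with `nodeName` set to the backend pod's own node and once 
with a different node, then comparing `kubectl logs cross-node-probe`, should 
show whether the failure follows node boundaries.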
   
   ## Our Fix
   
   We added custom security group rules to explicitly allow port 80 traffic 
between worker nodes 
([Automya/claims#185](https://github.com/Automya/claims/pull/185)):
   
   ```yaml
   node_security_group_additional_rules:
     ingress_node_ports_fronts:
       description                   : "Allow port 80 from cluster to worker nodes"
       protocol                      : "tcp"
       from_port                     : 80
       to_port                       : 80
       type                          : "ingress"
       source_cluster_security_group : true
     ingress_node_ports_fronts_self:
       description                   : "Allow port 80 between worker nodes"
       protocol                      : "tcp"
       from_port                     : 80
       to_port                       : 80
       type                          : "ingress"
       self                          : true
   ```
   
   After applying these rules, the issue was **fully resolved** — APISIX 
gateways could reach backend pods on any node in the cluster without 
`Connection refused` errors, regardless of pod placement.
   
   ## Recommendation
   
   If you're running on EKS (especially with the terraform-aws-eks module) and 
your backend services expose pods on **port 80 or any port below 1024**, check 
your node security group rules. The default rules only allow ephemeral ports 
(1025-65535), and traffic to low ports between nodes will be silently dropped.
   
   The proper long-term fix is to **migrate backend services to listen on high 
ports (1025 or above)**, which fall inside the default allowed range and align 
with AWS best practices for non-privileged pods. The custom security group rules 
above are a valid workaround if migrating ports immediately is not feasible.
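   
   As a rough illustration of that migration, the sketch below moves a 
front-end to port 8080 while the Service keeps exposing port 80. The names and 
labels are hypothetical, the `nginxinc/nginx-unprivileged` image is swapped in 
here because it listens on 8080 by default (it is not what we run), and the 
sketch assumes the gateway routes to pod endpoints, so cross-node traffic 
targets 8080 rather than 80.
   
   ```yaml
   # Hypothetical front-end; names and labels are placeholders.
   apiVersion: apps/v1
   kind: Deployment
   metadata:
     name: frontend
   spec:
     replicas: 2
     selector:
       matchLabels:
         app: frontend
     template:
       metadata:
         labels:
           app: frontend
       spec:
         containers:
           - name: nginx
             # Unprivileged Nginx image that listens on 8080 by default.
             image: nginxinc/nginx-unprivileged
             ports:
               - containerPort: 8080
   ---
   apiVersion: v1
   kind: Service
   metadata:
     name: frontend
   spec:
     selector:
       app: frontend
     ports:
       - port: 80          # port that routes and clients keep using
         targetPort: 8080  # port the pod actually listens on
   ```
   
   With this layout, existing routes that reference the Service on port 80 
would not need to change; only the pod-level port moves into the allowed range.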
   
   **TL;DR**: In our case, APISIX was working correctly — it was the EKS node 
security groups blocking cross-node traffic on port 80 that caused the 
intermittent `Connection refused` errors after pod rescheduling.
   

