zuiyangqingzhou opened a new issue, #10624:
URL: https://github.com/apache/apisix/issues/10624

   ### Current Behavior
   
   A faulty (unreachable) node still receives forwarded traffic.
   
   https://github.com/apache/apisix/blob/master/apisix/utils/upstream.lua#L70
   
   According to the code linked above, when an upstream node is an LB address 
or a domain name, DNS resolution is performed, but only one randomly chosen IP 
from the answer is returned. 
   
   As a result, the randomly returned IP can be the faulty node.
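
To make the effect concrete, here is a minimal Python sketch that models "resolve the domain, then keep one random IP". It is not APISIX code: `pick_one_ip` and `faulty_pick_rate` are hypothetical names, the uniform random choice is an assumption, and the addresses are the ones from the reproduction steps in this report.

```python
import random

# A records copied from the reproduction steps in this report.
RECORDS = {
    "www.mytest.com": ["192.168.247.4", "192.168.247.2", "192.168.247.3"],
    "www.mytemp.com": ["192.168.246.3", "192.168.246.4", "192.168.246.2"],
}
# The two IPs that refuse connections in the reproduction steps.
FAULTY = {"192.168.247.4", "192.168.246.3"}


def pick_one_ip(domain, rng):
    """Hypothetical stand-in: resolve a domain and keep one random IP."""
    return rng.choice(RECORDS[domain])


def faulty_pick_rate(trials, seed=0):
    """Return the fraction of simulated resolutions that hit a faulty IP."""
    rng = random.Random(seed)
    domains = list(RECORDS)
    hits = 0
    for _ in range(trials):
        ip = pick_one_ip(rng.choice(domains), rng)
        if ip in FAULTY:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print(f"faulty pick rate: {faulty_pick_rate(100_000):.3f}")
```

With one faulty IP out of three per domain, this model puts the chance of a resolution landing on a faulty node at roughly one in three. Real behavior also depends on DNS caching and retries, which the sketch ignores.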
   
   ### Expected Behavior
   
   Faulty nodes should be removed from rotation and should not receive traffic.
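
The expected behavior could be sketched as follows in Python. This only illustrates the idea of expanding each domain to all of its resolved IPs and excluding the ones that health checks mark as unhealthy. `healthy_targets` is a hypothetical helper, not an APISIX function, and the data comes from the reproduction steps in this report.

```python
# A records and faulty IPs taken from the reproduction steps in this report.
RECORDS = {
    "www.mytest.com": ["192.168.247.4", "192.168.247.2", "192.168.247.3"],
    "www.mytemp.com": ["192.168.246.3", "192.168.246.4", "192.168.246.2"],
}
UNHEALTHY = {"192.168.247.4", "192.168.246.3"}


def healthy_targets(records, unhealthy):
    """Expand every domain to all of its IPs and drop the unhealthy ones."""
    targets = []
    for domain, ips in records.items():
        for ip in ips:
            if ip not in unhealthy:
                targets.append((domain, ip))
    return targets


if __name__ == "__main__":
    for domain, ip in healthy_targets(RECORDS, UNHEALTHY):
        print(domain, ip)
```

In this model the load balancer would choose only among the four remaining IPs, so no request would be sent to `192.168.247.4` or `192.168.246.3`.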
   
   ### Error Logs
   
   2023/12/09 22:36:56 [error] 15767#89433274: *42241 [lua] balancer.lua:363: 
run(): failed to pick server: failed to find valid upstream server, all 
upstream servers tried while connecting to upstream, client: 127.0.0.1, server: 
_, request: "GET /dns/test HTTP/1.1", upstream: 
"http://192.168.247.4:80/dns/test", host: "127.0.0.1:9080"
   
   ### Steps to Reproduce
   
   1. Prepare two domains, each resolving to three A records
   ```
   $ dig @127.0.0.1 www.mytest.com
   
   www.mytest.com.              0       IN      A       192.168.247.4
   www.mytest.com.              0       IN      A       192.168.247.2
   www.mytest.com.              0       IN      A       192.168.247.3
   
   $ dig @127.0.0.1 www.mytemp.com
   
   www.mytemp.com.              0       IN      A       192.168.246.3
   www.mytemp.com.              0       IN      A       192.168.246.4
   www.mytemp.com.              0       IN      A       192.168.246.2
   ```
   2. Each domain has one faulty node
   ```
   $ curl http://192.168.247.4/
   curl: (7) Failed to connect to 192.168.247.4 port 80 after 4888 ms: Couldn't 
connect to server
   
   $ curl http://192.168.246.3/
   curl: (7) Failed to connect to 192.168.246.3 port 80 after 4888 ms: Couldn't 
connect to server
   ```
   3. The complete route configuration is as follows
   ```
   {
       "id": "490771170321239793",
       "create_time": 1702052012,
       "update_time": 1702132481,
       "uri": "/dns/test",
       "name": "dns_test",
       "methods": [
           "GET",
           "POST",
           "PUT",
           "DELETE",
           "PATCH",
           "HEAD",
           "OPTIONS",
           "CONNECT",
           "TRACE"
       ],
       "upstream": {
           "nodes": {
               "www.mytemp.com:80": 1,
               "www.mytest.com:80": 1
           },
           "timeout": {
               "connect": 6,
               "send": 6,
               "read": 6
           },
           "type": "roundrobin",
           "checks": {
               "active": {
                   "concurrency": 10,
                   "healthy": {
                       "http_statuses": [
                           200,
                           302
                       ],
                       "interval": 1,
                       "successes": 2
                   },
                   "http_path": "/aa",
                   "port": 80,
                   "timeout": 1,
                   "type": "http",
                   "unhealthy": {
                       "http_failures": 5,
                       "http_statuses": [
                           429,
                           404,
                           500,
                           501,
                           502,
                           503,
                           504,
                           505
                       ],
                       "interval": 1,
                       "tcp_failures": 2,
                       "timeouts": 3
                   }
               }
           },
           "scheme": "http",
           "pass_host": "pass",
           "keepalive_pool": {
               "idle_timeout": 60,
               "requests": 1000,
               "size": 320
           }
       },
       "status": 1
   }
   ```
   4. Initiate a request
   ```
   curl http://127.0.0.1:9080/dns/test -i
   ```
   5. Some requests fail with the following error
   ```
   HTTP/1.1 502 Bad Gateway
   Date: Sat, 09 Dec 2023 14:36:21 GMT
   Content-Type: text/html; charset=utf-8
   Content-Length: 154
   Connection: keep-alive
   Server: APISIX/3.7.0
   X-APISIX-Upstream-Status: 504 :
   
   <html>
   <head><title>502 Bad Gateway</title></head>
   <body>
   <center><h1>502 Bad Gateway</h1></center>
   <hr><center>openresty</center>
   </body>
   </html>
   ```
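
Because the failure is intermittent, a single request may not show it. The Python sketch below repeats the request from step 4 and counts the status codes. `fetch_status` and `tally` are hypothetical helper names, and the 30-second timeout is an arbitrary choice.

```python
import urllib.error
import urllib.request
from collections import Counter

URL = "http://127.0.0.1:9080/dns/test"  # same URL as the request in step 4


def fetch_status(url, timeout=30):
    """Return the HTTP status code, including 4xx and 5xx responses."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for error statuses such as 502.
        return err.code


def tally(fetch, url, attempts):
    """Call fetch(url) `attempts` times and count each status code."""
    return Counter(fetch(url) for _ in range(attempts))
```

For example, `tally(fetch_status, URL, 50)` returns a `Counter` keyed by status code, so any 502 responses show up as their own entry. Connection-level failures (for instance, if APISIX itself is not running) are not caught and will raise `urllib.error.URLError`.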
   
   ### Environment
   
   - APISIX version (run `apisix version`): APISIX/3.7.0
   - Operating system (run `uname -a`):  Darwin
   - OpenResty / Nginx version (run `openresty -V` or `nginx -V`):  nginx 
version: openresty/1.21.4.2
   - etcd version, if relevant (run `curl 
http://127.0.0.1:9090/v1/server_info`):
   - APISIX Dashboard version, if relevant:
   - Plugin runner version, for issues related to plugin runners:
   - LuaRocks version, for installation issues (run `luarocks --version`):
   

