nic-6443 commented on code in PR #13322:
URL: https://github.com/apache/apisix/pull/13322#discussion_r3162274110


##########
t/plugin/ai-proxy-multi3.t:
##########
@@ -1060,3 +1060,50 @@ failed to get health check target status
 --- error_log
 releasing existing checker
 --- timeout: 5
+
+
+
+=== TEST 14: construct_upstream resolves _dns_value when nil (config table 
replacement scenario)
+--- config
+    location /t {
+        content_by_lua_block {
+            local plugin = require("apisix.plugins.ai-proxy-multi")
+
+            -- Simulate an instance after config table replacement: _dns_value 
is lost
+            local instance = {
+                name = "test-instance",
+                provider = "openai",
+                weight = 1,
+                priority = 0,
+                override = {
+                    endpoint = "https://api.openai.com:443";,
+                },

Review Comment:
   The test uses `api.openai.com` as a URL string in the endpoint config — 
`resolve_endpoint` parses it via regex, not DNS. `parse_domain_for_node` 
attempts DNS resolution but fails silently in the test env (no real resolver), 
so `node.host` stays as the domain string. This is consistent with existing 
tests in the file.



##########
apisix/plugins/ai-proxy-multi.lua:
##########
@@ -253,12 +253,22 @@ local function fetch_health_instances(conf, checkers)
             local port = ins.checks and ins.checks.active and 
ins.checks.active.port
 
             local node = ins._dns_value
-            local ok, err = checker:get_target_status(node.host, port or 
node.port, host)
-            if ok then
+            if not node then
+                resolve_endpoint(ins)
+                node = ins._dns_value
+            end
+            if node then
+                local ok, err = checker:get_target_status(node.host, port or 
node.port, host)
+                if ok then
+                    transform_instances(new_instances, ins)
+                elseif err then
+                    core.log.warn("failed to get health check target status, 
addr: ",
+                        node.host, ":", port or node.port, ", host: ", host, 
", err: ", err)
+                end
+            else
+                core.log.warn("failed to resolve endpoint for instance: ", 
ins.name,
+                    ", treating as healthy")
                 transform_instances(new_instances, ins)

Review Comment:
   Treating as healthy is intentional — if resolution fails transiently, 
dropping the instance would cause unnecessary traffic loss. The existing 
fallback at L276-281 ("all upstream nodes unhealthy, use default") already 
handles the all-fail case. This is the conservative/safe direction.



##########
apisix/plugins/ai-proxy-multi.lua:
##########
@@ -253,12 +253,22 @@ local function fetch_health_instances(conf, checkers)
             local port = ins.checks and ins.checks.active and 
ins.checks.active.port
 
             local node = ins._dns_value
-            local ok, err = checker:get_target_status(node.host, port or 
node.port, host)
-            if ok then
+            if not node then
+                resolve_endpoint(ins)
+                node = ins._dns_value
+            end

Review Comment:
   Fair point, but extracting a helper for 4 lines used in 2 places feels like 
over-abstraction. Happy to reconsider if more call sites emerge.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to