Copilot commented on code in PR #13322:
URL: https://github.com/apache/apisix/pull/13322#discussion_r3162132003
##########
t/plugin/ai-proxy-multi3.t:
##########
@@ -1060,3 +1060,50 @@ failed to get health check target status
--- error_log
releasing existing checker
--- timeout: 5
+
+
+
+=== TEST 14: construct_upstream resolves _dns_value when nil (config table
replacement scenario)
+--- config
+ location /t {
+ content_by_lua_block {
+ local plugin = require("apisix.plugins.ai-proxy-multi")
+
+ -- Simulate an instance after config table replacement: _dns_value
is lost
+ local instance = {
+ name = "test-instance",
+ provider = "openai",
+ weight = 1,
+ priority = 0,
+ override = {
+ endpoint = "https://api.openai.com:443",
+ },
Review Comment:
This test relies on resolving an external domain (`api.openai.com`). In CI
environments without external DNS/network access, it can become flaky or fail.
Prefer using a deterministic endpoint (e.g., an IP like
`https://127.0.0.1:443`, or a locally provisioned test hostname/server) so the
test validates the `_dns_value` fallback without depending on external DNS.
##########
apisix/plugins/ai-proxy-multi.lua:
##########
@@ -253,12 +253,22 @@ local function fetch_health_instances(conf, checkers)
local port = ins.checks and ins.checks.active and
ins.checks.active.port
local node = ins._dns_value
- local ok, err = checker:get_target_status(node.host, port or
node.port, host)
- if ok then
+ if not node then
+ resolve_endpoint(ins)
+ node = ins._dns_value
+ end
+ if node then
+ local ok, err = checker:get_target_status(node.host, port or
node.port, host)
+ if ok then
+ transform_instances(new_instances, ins)
+ elseif err then
+ core.log.warn("failed to get health check target status,
addr: ",
+ node.host, ":", port or node.port, ", host: ", host,
", err: ", err)
+ end
+ else
+ core.log.warn("failed to resolve endpoint for instance: ",
ins.name,
+ ", treating as healthy")
transform_instances(new_instances, ins)
Review Comment:
When resolution still fails (node remains nil), the code includes the
instance as healthy. This can mask real misconfiguration (or resolver failures)
by keeping an unreachable instance in rotation. Safer behavior is to skip
`transform_instances()` (treat as unhealthy) or gate the 'treat as healthy'
behavior to a clearly identified transient scenario; at minimum, consider
making this fallback explicit/rare to avoid silently serving traffic to an
unresolved target.
```suggestion
", skipping instance")
```
##########
apisix/plugins/ai-proxy-multi.lua:
##########
@@ -253,12 +253,22 @@ local function fetch_health_instances(conf, checkers)
local port = ins.checks and ins.checks.active and
ins.checks.active.port
local node = ins._dns_value
- local ok, err = checker:get_target_status(node.host, port or
node.port, host)
- if ok then
+ if not node then
+ resolve_endpoint(ins)
+ node = ins._dns_value
+ end
Review Comment:
The new health-check fallback resolution path isn’t covered by the added
regression test (which only exercises `construct_upstream`). Add a test that
simulates the periodic health-check validation scenario with `_dns_value ==
nil` and asserts it no longer errors/logs the previous noisy warning, and that
the instance handling behaves as intended.
##########
t/plugin/ai-proxy-multi3.t:
##########
@@ -1060,3 +1060,50 @@ failed to get health check target status
--- error_log
releasing existing checker
--- timeout: 5
+
+
+
+=== TEST 14: construct_upstream resolves _dns_value when nil (config table
replacement scenario)
+--- config
+ location /t {
+ content_by_lua_block {
+ local plugin = require("apisix.plugins.ai-proxy-multi")
+
+ -- Simulate an instance after config table replacement: _dns_value
is lost
+ local instance = {
+ name = "test-instance",
+ provider = "openai",
+ weight = 1,
+ priority = 0,
+ override = {
+ endpoint = "https://api.openai.com:443",
+ },
+ auth = {
+ header = {
+ Authorization = "Bearer test-key",
+ },
+ },
+ }
+
+ -- Confirm _dns_value is nil (simulating config table replacement)
+ assert(instance._dns_value == nil, "_dns_value should be nil
initially")
+
+ -- construct_upstream should resolve _dns_value as fallback
+ local upstream, err = plugin.construct_upstream(instance)
+ if not upstream then
+ ngx.say("FAIL: " .. err)
+ return
+ end
+
+ -- Verify _dns_value was populated
+ assert(instance._dns_value ~= nil, "_dns_value should be set after
construct_upstream")
+
+ ngx.say("host: ", upstream.nodes[1].host)
+ ngx.say("port: ", upstream.nodes[1].port)
+ ngx.say("passed")
+ }
+ }
+--- response_body
+host: api.openai.com
+port: 443
Review Comment:
This test relies on resolving an external domain (`api.openai.com`). In CI
environments without external DNS/network access, it can become flaky or fail.
Prefer using a deterministic endpoint (e.g., an IP like
`https://127.0.0.1:443`, or a locally provisioned test hostname/server) so the
test validates the `_dns_value` fallback without depending on external DNS.
##########
apisix/plugins/ai-proxy-multi.lua:
##########
@@ -253,12 +253,22 @@ local function fetch_health_instances(conf, checkers)
local port = ins.checks and ins.checks.active and
ins.checks.active.port
local node = ins._dns_value
- local ok, err = checker:get_target_status(node.host, port or
node.port, host)
- if ok then
+ if not node then
+ resolve_endpoint(ins)
+ node = ins._dns_value
+ end
Review Comment:
The 'resolve if `_dns_value` is nil, then re-read `_dns_value`' logic is now
duplicated in multiple places. Consider extracting a small helper (e.g.,
`get_or_resolve_node(instance)`) to reduce future drift and ensure consistent
behavior between health-check logic and upstream construction.
##########
apisix/plugins/ai-proxy-multi.lua:
##########
@@ -439,7 +449,11 @@ function _M.construct_upstream(instance)
local upstream = {}
local node = instance._dns_value
if not node then
- return nil, "failed to resolve endpoint for instance: " ..
instance.name
+ resolve_endpoint(instance)
+ node = instance._dns_value
+ if not node then
+ return nil, "failed to resolve endpoint for instance: " ..
instance.name
+ end
Review Comment:
The 'resolve if `_dns_value` is nil, then re-read `_dns_value`' logic is now
duplicated in multiple places. Consider extracting a small helper (e.g.,
`get_or_resolve_node(instance)`) to reduce future drift and ensure consistent
behavior between health-check logic and upstream construction.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]