janiussyafiq commented on PR #13578:
URL: https://github.com/apache/apisix/pull/13578#issuecomment-4806353252
@membphis
Addressed all the previous concerns. The three cache-key reports were one
root cause: the key was computed in `access` from the client request, but the
response is determined by the *effective* upstream request `ai-proxy` builds
later in `before_proxy` (after `ai-cache` runs), so server-side
`options`/`override` never reached the key.
The fingerprint now identifies that effective request. Since `final_body =
f(client_body, protocol, instance{provider, options, override})` is
deterministic, it hashes those inputs:
- **client:** protocol, messages, params
- **effective:** provider, effective model (`options.model or body.model`),
`options`, `override.llm_options`, `override.request_body` (+
`request_body_force_override`), `override.endpoint`
`scope()` is reduced to pure isolation (route / consumer / `include_vars`).
One rule closes all three cases, the plain-`ai-proxy` effective model, the
`options`/`override` params, and the earlier instance/provider collision.
Keying off the final upstream body directly isn't feasible: `before_proxy`
is where that body is built *and* where the upstream call happens, so a
read-through cache that short-circuits in `access` can't see it but hashing the
builder's inputs is equivalent.
While reworking this I also closed a related case: on `ai-proxy-multi`
failover, the fallback instance's response could be written under the
originally-picked instance's key. The log-phase write is now skipped when the
serving instance differs from the one the key was computed for. Let me know
what your view on this.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]