Yicong-Huang opened a new issue, #5422:
URL: https://github.com/apache/texera/issues/5422
### Task Summary
`access-control-service/.../AccessControlResource.scala` hosts two JAX-RS
classes — `LiteLLMProxyResource` (`/chat/*`) and `LiteLLMModelsResource`
(`/models`) — that exist only to forward HTTP requests to LiteLLM with the
deployment's master API key. PR #5421 hardened these by adding
`@RolesAllowed("REGULAR", "ADMIN")`, but a JVM service whose only job is to
copy a request bytewise to another HTTP service is the wrong architecture:
- adds an extra network hop to every request (frontend → access-control-service → LiteLLM)
- couples LLM availability to the access-control-service deployment
- forces Texera's Scala code to maintain a hand-rolled HTTP proxy (header
stripping and forwarding, status passthrough, error wrapping; every line is a
regression risk)
- forces the LLM API key to live in the same process that handles unrelated
routing concerns
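
To illustrate the maintenance burden, here is a minimal sketch of just the header-handling step that any hand-rolled proxy has to get right. It is not the actual `AccessControlResource.scala` code: `ProxyHeaders`, `forUpstream`, and `upstreamApiKey` are hypothetical names, and the hop-by-hop list is the classic one from RFC 2616 section 13.5.1.

```scala
import java.util.Locale

// Hypothetical helper, not taken from the Texera codebase.
object ProxyHeaders {
  // Hop-by-hop headers (RFC 2616 section 13.5.1) that a proxy must not forward.
  private val hopByHop: Set[String] = Set(
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailer", "transfer-encoding", "upgrade"
  )

  /** Build the header set to send upstream from the incoming request headers. */
  def forUpstream(incoming: Map[String, String], upstreamApiKey: String): Map[String, String] = {
    // Headers named in the Connection header are hop-by-hop too.
    val namedByConnection: Set[String] = incoming.toSeq.flatMap {
      case (name, value) if name.equalsIgnoreCase("connection") =>
        value.split(',').map(_.trim.toLowerCase(Locale.ROOT)).toSeq
      case _ => Seq.empty[String]
    }.toSet

    // Also drop the caller's own credentials and Host; the upstream gets the master key.
    val dropped = hopByHop ++ namedByConnection ++ Set("authorization", "host")
    val kept = incoming.filterNot { case (name, _) =>
      dropped.contains(name.toLowerCase(Locale.ROOT))
    }
    kept + ("Authorization" -> s"Bearer $upstreamApiKey")
  }
}
```

Status passthrough, streaming bodies, and error wrapping each need similar care, which is what a standard reverse proxy already provides.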
Possible replacements:
- Frontend talks to LiteLLM directly through Envoy / API gateway, with auth
checked at the gateway and a short-lived per-user LLM token issued by
access-control-service
- Move the proxy to a generic reverse-proxy in front of the cluster (NGINX,
Envoy) with auth offloaded to JWT validation at the edge
- Use a managed AI gateway product (Vercel AI Gateway, LiteLLM's own UI
gateway) instead of running our own JAX-RS class
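
As a rough sketch of the first option, a short-lived per-user token could be an HMAC-signed `userId:expiry` pair that access-control-service issues and the gateway verifies. Everything here is an assumption for illustration: `ShortLivedLlmToken`, the token layout, and the shared secret are hypothetical, and a real implementation would more likely use a standard JWT library or whatever key mechanism the chosen gateway supports.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import java.security.MessageDigest
import java.util.Base64
import javax.crypto.Mac
import javax.crypto.spec.SecretKeySpec
import scala.util.Try

// Hypothetical token helper; the format is illustrative, not a LiteLLM or Texera API.
object ShortLivedLlmToken {
  private val encoder = Base64.getUrlEncoder.withoutPadding()
  private val decoder = Base64.getUrlDecoder

  private def sign(secret: Array[Byte], payload: String): String = {
    val mac = Mac.getInstance("HmacSHA256")
    mac.init(new SecretKeySpec(secret, "HmacSHA256"))
    encoder.encodeToString(mac.doFinal(payload.getBytes(UTF_8)))
  }

  /** Issue "<base64url(userId:expiry)>.<signature>", valid for ttlSeconds from now. */
  def issue(secret: Array[Byte], userId: String, nowEpochSec: Long, ttlSeconds: Long): String = {
    val payload = encoder.encodeToString(s"$userId:${nowEpochSec + ttlSeconds}".getBytes(UTF_8))
    s"$payload.${sign(secret, payload)}"
  }

  /** Return the user id if the signature matches and the token has not expired. */
  def verify(secret: Array[Byte], token: String, nowEpochSec: Long): Option[String] =
    token.split('.') match {
      case Array(payload, signature) =>
        val expected = sign(secret, payload).getBytes(UTF_8)
        // Constant-time comparison to avoid leaking how much of the signature matched.
        if (!MessageDigest.isEqual(expected, signature.getBytes(UTF_8))) None
        else {
          val decoded = new String(decoder.decode(payload), UTF_8)
          val sep = decoded.lastIndexOf(':')
          for {
            expiry <- Try(decoded.substring(sep + 1).toLong).toOption
            if sep > 0 && nowEpochSec < expiry
          } yield decoded.substring(0, sep)
        }
      case _ => None
    }
}
```

With this split, the master API key stays with the gateway or LiteLLM, and access-control-service only needs the signing secret.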
PR #5421 leaves the existing proxy in place so the hardening can ship
without a bigger architectural change; this issue tracks the replacement.
### Task Type
- [x] Refactor / Cleanup