Yicong-Huang opened a new issue, #5422:
URL: https://github.com/apache/texera/issues/5422

   ### Task Summary
   
   `access-control-service/.../AccessControlResource.scala` hosts two JAX-RS 
classes — `LiteLLMProxyResource` (`/chat/*`) and `LiteLLMModelsResource` 
(`/models`) — that exist only to forward HTTP requests to LiteLLM with the 
deployment's master API key. PR #5421 hardened these by adding 
`@RolesAllowed("REGULAR", "ADMIN")`, but a JVM service whose only job is to 
copy a request bytewise to another HTTP service is the wrong architecture:
   
   - adds an extra network hop, and its latency, to every request (frontend → access-control-service → LiteLLM)
   - couples LLM availability to the access-control-service deployment
   - forces Texera's Scala code to maintain a hand-rolled HTTP proxy (header 
stripping / forwarding, status passthrough, error wrapping), where every line 
is a regression risk
   - forces the LLM API key to live in the same process that handles unrelated 
routing concerns
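
To make the "hand-rolled HTTP proxy" cost concrete, here is a minimal sketch of just the header-handling step such a proxy has to get right. It is not the code in `AccessControlResource.scala`: the object name `ProxyHeaders`, the method `forUpstream`, and the `Authorization: Bearer <key>` scheme for the upstream are assumptions made for illustration. The hop-by-hop list follows the headers commonly treated as hop-by-hop (see RFC 2616 section 13.5.1 and RFC 7230 section 6.1).

```scala
// Hypothetical sketch; names and behavior are illustrative, not Texera's code.
object ProxyHeaders {
  // Headers commonly treated as hop-by-hop; a proxy should not forward them.
  private val hopByHop: Set[String] = Set(
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailer", "transfer-encoding", "upgrade"
  )

  // Headers the proxy sets itself rather than copying from the caller.
  private val replaced: Set[String] = Set("authorization", "host", "content-length")

  /** Build the header map sent upstream, injecting the master key (assumed Bearer scheme). */
  def forUpstream(incoming: Map[String, String], masterKey: String): Map[String, String] = {
    val kept = incoming.filter { case (name, _) =>
      val n = name.toLowerCase
      !hopByHop.contains(n) && !replaced.contains(n)
    }
    kept + ("Authorization" -> s"Bearer $masterKey")
  }
}
```

Even this small piece has several ways to regress (case-sensitive header matching, leaking the caller's `Authorization` value upstream, forwarding `Transfer-Encoding`), which is the maintenance burden a general-purpose reverse proxy would take over.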
   
   Possible replacements:
   
   - Frontend talks to LiteLLM directly through Envoy / API gateway, with auth 
checked at the gateway and a short-lived per-user LLM token issued by 
access-control-service
   - Move the proxying to a generic reverse proxy in front of the cluster (NGINX, 
Envoy) with auth offloaded to JWT validation at the edge
   - Use a managed AI gateway product (Vercel AI Gateway, LiteLLM's own UI 
gateway) instead of running our own JAX-RS class
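
As a rough illustration of the first option, the sketch below shows one way access-control-service could issue a short-lived per-user token that a gateway later verifies. Everything here is an assumption: the object name `ShortLivedLlmToken`, the `<userId>.<expiryEpochSeconds>.<signature>` format, and the shared-secret HMAC scheme are hypothetical. A real deployment would more likely issue standard JWTs so that Envoy or NGINX can validate them natively; a bare HMAC is used here only to keep the example dependency-free (JDK classes only, Scala 2.13+ for `toLongOption`).

```scala
import java.nio.charset.StandardCharsets.UTF_8
import java.security.MessageDigest
import java.util.Base64
import javax.crypto.Mac
import javax.crypto.spec.SecretKeySpec

// Hypothetical sketch: token format and names are illustrative only.
object ShortLivedLlmToken {
  private val encoder = Base64.getUrlEncoder.withoutPadding()

  // HMAC-SHA256 over the payload, URL-safe Base64 encoded (contains no '.').
  private def sign(secret: Array[Byte], payload: String): String = {
    val mac = Mac.getInstance("HmacSHA256")
    mac.init(new SecretKeySpec(secret, "HmacSHA256"))
    encoder.encodeToString(mac.doFinal(payload.getBytes(UTF_8)))
  }

  /** Issue a token of the form "<userId>.<expiryEpochSeconds>.<signature>". */
  def issue(secret: Array[Byte], userId: Long, nowEpochSec: Long, ttlSec: Long): String = {
    val payload = s"$userId.${nowEpochSec + ttlSec}"
    s"$payload.${sign(secret, payload)}"
  }

  /** Return the user id if the signature matches and the token has not expired. */
  def verify(secret: Array[Byte], token: String, nowEpochSec: Long): Option[Long] =
    token.split('.') match {
      case Array(uid, exp, sig) =>
        val expected = sign(secret, s"$uid.$exp")
        // Constant-time comparison of the expected and presented signatures.
        val sigOk = MessageDigest.isEqual(expected.getBytes(UTF_8), sig.getBytes(UTF_8))
        for {
          u <- uid.toLongOption
          e <- exp.toLongOption
          if sigOk && nowEpochSec < e
        } yield u
      case _ => None
    }
}
```

With this shape, the master LiteLLM key stays at the gateway, and the frontend only ever holds a token that expires after `ttlSec` seconds and is bound to one user id.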
   
   PR #5421 leaves the existing proxy in place so the hardening can ship 
without a bigger architectural change; this issue tracks the replacement.
   
   ### Task Type
   
   - [x] Refactor / Cleanup

