janiussyafiq commented on issue #12962: URL: https://github.com/apache/apisix/issues/12962#issuecomment-3869073945
Here's an example of your use case @RGZingYang :

1. Configure two routes with the same `uri`. Each route uses `vars` to match the model you want. In this example I used Anthropic as my provider:

```
# first route for opus 4.6
curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT -H "X-API-KEY: XXXXX" -i -d '
{
  "id": "ai-proxy-claude-opus",
  "methods": ["POST"],
  "uri": "/ai",
  "vars": [
    ["post_arg.model", "==", "claude-opus"]
  ],
  "plugins": {
    "ai-proxy": {
      "name": "claude-opus",
      "provider": "anthropic",
      "auth": {
        "header": {
          "Authorization": "Bearer '"$ANTHROPIC_API_KEY"'"
        }
      },
      "options": {
        "model": "claude-opus-4-6"
      }
    },
    "ai-rate-limiting": {
      "instances": [
        {
          "name": "anthropic-instance-opus",
          "limit": 10,
          "time_window": 60
        }
      ],
      "limit_strategy": "total_tokens"
    }
  }
}'

# second route for sonnet 4.5
curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT -H "X-API-KEY: XXXXX" -i -d '
{
  "id": "ai-proxy-claude-sonnet",
  "methods": ["POST"],
  "uri": "/ai",
  "vars": [
    ["post_arg.model", "==", "claude-sonnet"]
  ],
  "plugins": {
    "ai-proxy": {
      "name": "claude-sonnet",
      "provider": "anthropic",
      "auth": {
        "header": {
          "Authorization": "Bearer '"$ANTHROPIC_API_KEY"'"
        }
      },
      "options": {
        "model": "claude-sonnet-4-5"
      }
    },
    "ai-rate-limiting": {
      "instances": [
        {
          "name": "anthropic-instance-sonnet",
          "limit": 10,
          "time_window": 60
        }
      ],
      "limit_strategy": "total_tokens"
    }
  }
}'
```

2. You can then verify that both routes work by sending a request to each model:

```
curl -i -X POST http://127.0.0.1:9080/ai -H "Content-Type: application/json" -d '{"model": "claude-opus", "messages": [{"role": "user", "content": "hi"}]}'

curl -i -X POST http://127.0.0.1:9080/ai -H "Content-Type: application/json" -d '{"model": "claude-sonnet", "messages": [{"role": "user", "content": "hi"}]}'
```

3.
You will get the following success response if everything is configured correctly:

```
HTTP/1.1 200 OK
Date: Mon, 09 Feb 2026 03:00:57 GMT
Content-Type: application/json
Transfer-Encoding: chunked
Connection: keep-alive
Server: APISIX/3.15.0

{"id":"msg_014vmk8dP33iGZDSWouQiiuZ","choices":[{"finish_reason":"stop","index":0,"message":{"role":"assistant","content":"Hi there! 👋 How are you doing today? Is there anything I can help you with?"}}],"created":1770606057,"model":"claude-opus-4-6","object":"chat.completion","usage":{"completion_tokens":25,"prompt_tokens":8,"total_tokens":33}}
```
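Since both routes share the same `uri` (`/ai`), APISIX picks a route by evaluating each route's `vars` expression against the request, here the `model` field of the POST body (`post_arg.model`). Purely to illustrate that selection, here's a minimal Python sketch; the route table and the `select_route` helper are simplifications of my own, not APISIX internals:

```python
import json

# Simplified route table mirroring the two routes configured above.
# Each entry: (route_id, var_name, operator, expected_value)
ROUTES = [
    ("ai-proxy-claude-opus", "post_arg.model", "==", "claude-opus"),
    ("ai-proxy-claude-sonnet", "post_arg.model", "==", "claude-sonnet"),
]

def select_route(request_body: str):
    """Return the first route whose vars expression matches the JSON body."""
    body = json.loads(request_body)
    for route_id, var, op, expected in ROUTES:
        if var == "post_arg.model" and op == "==" and body.get("model") == expected:
            return route_id
    return None  # no route matched; the gateway would reply 404

print(select_route('{"model": "claude-opus", "messages": []}'))
# -> ai-proxy-claude-opus
```

A request whose `model` value matches neither predicate falls through to no route, so clients must send one of the configured model aliases.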
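One thing to note about the `ai-rate-limiting` config above: with `limit_strategy: total_tokens`, the plugin charges the `usage.total_tokens` reported in each response against the instance's budget (here 10 per 60 s), so the single 33-token exchange above should already exhaust the window and cause the next request to be rejected until the window resets. A rough fixed-window sketch of that accounting, with class and method names of my own choosing rather than the plugin's internals:

```python
import time

class TokenWindow:
    """Fixed-window token budget, illustrating the total_tokens strategy."""

    def __init__(self, limit: int, time_window: int):
        self.limit = limit            # e.g. "limit": 10
        self.time_window = time_window  # e.g. "time_window": 60
        self.window_start = 0.0
        self.used = 0

    def allow(self, now: float | None = None) -> bool:
        """Check before proxying: reject once the window's budget is spent."""
        now = time.time() if now is None else now
        if now - self.window_start >= self.time_window:
            self.window_start = now   # roll over into a fresh window
            self.used = 0
        return self.used < self.limit

    def record(self, total_tokens: int) -> None:
        """Charge the tokens reported in the upstream response's usage field."""
        self.used += total_tokens

# With limit=10 per 60 s, one 33-token completion blocks further
# requests until the window rolls over.
w = TokenWindow(limit=10, time_window=60)
assert w.allow(now=0)
w.record(33)              # usage.total_tokens from the response above
assert not w.allow(now=30)
assert w.allow(now=61)    # new window, budget reset
```

If you want more than one "hi"-sized exchange per minute, raise `limit` well above a typical response's `total_tokens`.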
