(apisix) branch master updated: docs: fix ai-proxy-multi attribute nesting and add missing health check sub-attributes (#13169)

yilialin Fri, 17 Apr 2026 01:30:49 -0700

This is an automated email from the ASF dual-hosted git repository.

yilialin pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/apisix.git



The following commit(s) were added to refs/heads/master by this push:
     new 05a9bf252 docs: fix ai-proxy-multi attribute nesting and add missing health check sub-attributes (#13169)
05a9bf252 is described below

commit 05a9bf2526cdc89d1b0fac0cdae7082b281d06f7
Author: Yilia Lin <[email protected]>
AuthorDate: Fri Apr 17 16:30:31 2026 +0800

    docs: fix ai-proxy-multi attribute nesting and add missing health check sub-attributes (#13169)
---
 docs/en/latest/plugins/ai-proxy-multi.md | 1678 ++++++++++++++++++++++++++++-
 docs/zh/latest/plugins/ai-proxy-multi.md | 1697 +++++++++++++++++++++++++++++-
 2 files changed, 3293 insertions(+), 82 deletions(-)
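
The nesting change named in the commit subject can be illustrated with a minimal sketch. It assumes, based only on the attribute table in this diff, that `checks` and `override` are configured per entry of `instances` rather than at the top level of the Plugin configuration. The helper `misplaced_keys`, the instance name, the endpoint, and the `/healthz` path are hypothetical and are not part of APISIX.

```python
import json


def misplaced_keys(plugin_conf):
    """Return per-instance keys that were placed at the top level.

    Assumption: per the attribute table in this commit, `checks` and
    `override` belong under each entry of `instances`.
    """
    per_instance = ("checks", "override")
    return [key for key in per_instance if key in plugin_conf]


# Hypothetical instance; the name, endpoint, and health path are placeholders.
conf = {
    "instances": [
        {
            "name": "llm-instance",
            "provider": "openai-compatible",
            "weight": 1,
            "auth": {"header": {"Authorization": "Bearer your-api-key"}},
            "options": {"model": "your-model"},
            "override": {
                "endpoint": "http://llm.example.internal/v1/chat/completions"
            },
            # Health checks sit inside the instance, next to `override`.
            "checks": {
                "active": {
                    "type": "http",
                    "http_path": "/healthz",
                    "healthy": {"interval": 1, "successes": 2},
                    "unhealthy": {"interval": 1, "http_failures": 5},
                }
            },
        }
    ]
}

# Print the JSON body that would go under "ai-proxy-multi" in a Route.
print(json.dumps(conf, indent=2))
```

Running `misplaced_keys` on a configuration that still uses the old top-level `checks` layout returns the offending key names, which may help when migrating existing Routes by hand.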

diff --git a/docs/en/latest/plugins/ai-proxy-multi.md b/docs/en/latest/plugins/ai-proxy-multi.md
index dedde607c..2a71760b3 100644
--- a/docs/en/latest/plugins/ai-proxy-multi.md
+++ b/docs/en/latest/plugins/ai-proxy-multi.md
@@ -33,6 +33,9 @@ description: The ai-proxy-multi Plugin extends the capabilities of ai-proxy with
   <link rel="canonical" href="https://docs.api7.ai/hub/ai-proxy-multi" />
 </head>
 
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
 ## Description
 
 The `ai-proxy-multi` Plugin simplifies access to LLM and embedding models by transforming Plugin configurations into the designated request format for OpenAI, DeepSeek, Azure, AIMLAPI, Anthropic, OpenRouter, Gemini, Vertex AI, and other OpenAI-compatible APIs. It extends the capabilities of [`ai-proxy`](./ai-proxy.md) with load balancing, retries, fallbacks, and health checks.
@@ -49,9 +52,9 @@ In addition, the Plugin also supports logging LLM request information in the acc
 
 ## Attributes
 
-| Name                               | Type            | Required | Default                           | Valid Values | Description |
+| Name                               | Type            | Required | Default                           | Valid values | Description |
 |------------------------------------|----------------|----------|-----------------------------------|--------------|-------------|
-| fallback_strategy                  | string or array         | False    |  | string: "instance_health_and_rate_limiting", "http_429", "http_5xx"<br />array: ["rate_limiting", "http_429", "http_5xx"] | Fallback strategy. When set, the Plugin will check whether the specified instance’s token has been exhausted when a request is forwarded. If so, forward the request to the next instance regardless of the instance priority. When not set, the Plugin will not forward the request to low prior [...]
+| fallback_strategy                  | string or array         | False    |  | string: "instance_health_and_rate_limiting", "http_429", "http_5xx"<br />array: ["rate_limiting", "http_429", "http_5xx"] | Fallback strategy. When set, the Plugin will check whether the specified instance's token has been exhausted when a request is forwarded. If so, forward the request to the next instance regardless of the instance priority. When not set, the Plugin will not forward the request to low prior [...]
 | balancer                           | object         | False    |                                   |              | Load balancing configurations. |
 | balancer.algorithm                 | string         | False    | roundrobin                    | [roundrobin, chash] | Load balancing algorithm. When set to `roundrobin`, weighted round robin algorithm is used. When set to `chash`, consistent hashing algorithm is used. |
 | balancer.hash_on                   | string         | False    |                                   | [vars, headers, cookie, consumer, vars_combinations] | Used when `type` is `chash`. Support hashing on [NGINX variables](https://nginx.org/en/docs/varindex.html), headers, cookie, consumer, or a combination of [NGINX variables](https://nginx.org/en/docs/varindex.html). |
@@ -73,31 +76,29 @@ In addition, the Plugin also supports logging LLM request information in the acc
 | instances.auth.gcp.expire_early_secs| integer        | False    | 60                                | minimum = 0  | Seconds to expire the access token before its actual expiration time to avoid edge cases. |
 | instances.options                   | object         | False    |                                   |              | Model configurations. In addition to `model`, you can configure additional parameters and they will be forwarded to the upstream LLM service in the request body. For instance, if you are working with OpenAI, DeepSeek, or AIMLAPI, you can configure additional parameters such as `max_tokens`, `temperature`, `top_p`, and `stream`. See your LLM provider's API documentation f [...]
 | instances.options.model             | string         | False    |                                   |              | Name of the LLM model, such as `gpt-4` or `gpt-3.5`. See your LLM provider's API documentation for more available models. |
-| instances.override                  | object         | False    |                                   |              | Override setting. |
-| instances.override.endpoint         | string         | False    |                                   |              | LLM provider endpoint to replace the default endpoint with. If not configured, the Plugin uses the default OpenAI endpoint `https://api.openai.com/v1/chat/completions`. |
-| logging                             | object         | False    |                                   |              | Logging configurations. Does not affect `error.log`. |
-| logging.summaries                   | boolean        | False    | false                           |              | If true, logs request LLM model, duration, request, and response tokens. |
-| logging.payloads                    | boolean        | False    | false                           |              | If true, logs request and response payload. |
-| checks                              | object         | False    |                                   |              | Health check configurations. Note that at the moment, OpenAI, DeepSeek, and AIMLAPI do not provide an official health check endpoint. Other LLM services that you can configure under `openai-compatible` provider may have available health check endpoints. |
-| checks.active                       | object         | True     |                                   |              | Active health check configurations. |
-| checks.active.type                  | string         | False    | http                            | [http, https, tcp] | Type of health check connection. |
-| checks.active.timeout               | number         | False    | 1                               |              | Health check timeout in seconds. |
-| checks.active.concurrency           | integer        | False    | 10                              |              | Number of upstream nodes to be checked at the same time. |
-| checks.active.host                  | string         | False    |                                   |              | HTTP host. |
-| checks.active.port                  | integer        | False    |                                   | between 1 and 65535 inclusive | HTTP port. |
-| checks.active.http_path             | string         | False    | /                               |              | Path for HTTP probing requests. |
-| checks.active.https_verify_certificate | boolean   | False    | true                            |              | If true, verify the node's TLS certificate. |
-| checks.active.req_headers           | array[string]  | False    |                                   |              | Additional request headers for the active health check probe. |
-| checks.active.healthy               | object         | False    |                                   |              | Healthy check configurations. |
-| checks.active.healthy.interval      | integer        | False    | 1                               | minimum = 1  | Time interval of checking healthy nodes, in seconds. |
-| checks.active.healthy.http_statuses | array[integer] | False    | [200, 302]                      | between 200 and 599 | HTTP status codes defining a healthy node. |
-| checks.active.healthy.successes     | integer        | False    | 2                               | between 1 and 254 | Number of successful probes to define a healthy node. |
-| checks.active.unhealthy             | object         | False    |                                   |              | Unhealthy check configurations. |
-| checks.active.unhealthy.interval    | integer        | False    | 1                               | minimum = 1  | Time interval of checking unhealthy nodes, in seconds. |
-| checks.active.unhealthy.http_statuses | array[integer] | False  | [429, 404, 500, 501, 502, 503, 504, 505] | between 200 and 599 | HTTP status codes defining an unhealthy node. |
-| checks.active.unhealthy.http_failures | integer      | False    | 5                               | between 1 and 254 | Number of HTTP failures to define an unhealthy node. |
-| checks.active.unhealthy.tcp_failures | integer       | False    | 2                               | between 1 and 254 | Number of TCP failures to define an unhealthy node. |
-| checks.active.unhealthy.timeouts    | integer        | False    | 3                               | between 1 and 254 | Number of probe timeouts to define an unhealthy node. |
+| logging                             | object         | False    |                                   |              | Logging configurations. |
+| logging.summaries                   | boolean        | False    | false                           |              | If true, log request LLM model, duration, request, and response tokens. |
+| logging.payloads                    | boolean        | False    | false                           |              | If true, log request and response payload. |
+| instances.override                    | object         | False    |                                   |              | Override setting. |
+| instances.override.endpoint           | string         | False    |                                   |              | LLM provider endpoint to replace the default endpoint with. If not configured, the Plugin uses the default OpenAI endpoint `https://api.openai.com/v1/chat/completions`. |
+| instances.checks                              | object         | False    |                                   |              | Health check configurations. Note that at the moment, OpenAI, DeepSeek, and AIMLAPI do not provide an official health check endpoint. Other LLM services that you can configure under `openai-compatible` provider may have available health check endpoints. |
+| instances.checks.active                       | object         | True     |                                   |              | Active health check configurations. |
+| instances.checks.active.type                  | string         | False    | http                            | [http, https, tcp] | Type of health check connection. |
+| instances.checks.active.timeout               | number         | False    | 1                               |              | Health check timeout in seconds. |
+| instances.checks.active.concurrency           | integer        | False    | 10                              |              | Number of upstream nodes to be checked at the same time. |
+| instances.checks.active.host                  | string         | False    |                                   |              | HTTP host. |
+| instances.checks.active.port                  | integer        | False    |                                   | between 1 and 65535 inclusive | HTTP port. |
+| instances.checks.active.http_path             | string         | False    | /                               |              | Path for HTTP probing requests. |
+| instances.checks.active.https_verify_certificate | boolean   | False    | true                            |              | If true, verify the node's TLS certificate. |
+| instances.checks.active.healthy               | object         | False    |                                   |              | Healthy check configurations. |
+| instances.checks.active.healthy.interval      | integer        | False    | 1                               |              | Time interval of checking healthy nodes, in seconds. |
+| instances.checks.active.healthy.http_statuses | array[integer] | False    | [200,302]                       | status code between 200 and 599 inclusive | An array of HTTP status codes that defines a healthy node. |
+| instances.checks.active.healthy.successes     | integer        | False    | 2                               | between 1 and 254 inclusive | Number of successful probes to define a healthy node. |
+| instances.checks.active.unhealthy             | object         | False    |                                   |              | Unhealthy check configurations. |
+| instances.checks.active.unhealthy.interval    | integer        | False    | 1                               |              | Time interval of checking unhealthy nodes, in seconds. |
+| instances.checks.active.unhealthy.http_statuses | array[integer] | False  | [429,404,500,501,502,503,504,505] | status code between 200 and 599 inclusive | An array of HTTP status codes that defines an unhealthy node. |
+| instances.checks.active.unhealthy.http_failures | integer      | False    | 5                               | between 1 and 254 inclusive | Number of HTTP failures to define an unhealthy node. |
+| instances.checks.active.unhealthy.timeout     | integer        | False    | 3                               | between 1 and 254 inclusive | Number of probe timeouts to define an unhealthy node. |
 | timeout                             | integer        | False    | 30000                           | greater than or equal to 1 | Request timeout in milliseconds when requesting the LLM service. |
 | keepalive                           | boolean        | False    | true                            |              | If true, keep the connection alive when requesting the LLM service. |
 | keepalive_timeout                   | integer        | False    | 60000                           | greater than or equal to 1000 | Request timeout in milliseconds when requesting the LLM service. |
@@ -126,6 +127,17 @@ For demonstration and easier differentiation, you will be configuring one OpenAI
 
 Create a Route as such and update with your LLM providers, models, API keys, and endpoints if applicable:
 
+<Tabs
+groupId="api"
+defaultValue="admin-api"
+values={[
+{label: 'Admin API', value: 'admin-api'},
+{label: 'ADC', value: 'adc'},
+{label: 'Ingress Controller', value: 'aic'}
+]}>
+
+<TabItem value="admin-api">
+
 ```shell
 curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
   -H "X-API-KEY: ${admin_key}" \
@@ -168,6 +180,166 @@ curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
   }'
 ```
 
+</TabItem>
+
+<TabItem value="adc">
+
+```yaml title="adc.yaml"
+services:
+  - name: ai-proxy-multi-service
+    routes:
+      - name: ai-proxy-multi-route
+        uris:
+          - /anything
+        methods:
+          - POST
+        plugins:
+          ai-proxy-multi:
+            instances:
+              - name: openai-instance
+                provider: openai
+                weight: 8
+                auth:
+                  header:
+                    Authorization: "Bearer ${OPENAI_API_KEY}"
+                options:
+                  model: gpt-4
+              - name: deepseek-instance
+                provider: deepseek
+                weight: 2
+                auth:
+                  header:
+                    Authorization: "Bearer ${DEEPSEEK_API_KEY}"
+                options:
+                  model: deepseek-chat
+```
+
+Synchronize the configuration to the gateway:
+
+```shell
+adc sync -f adc.yaml
+```
+
+</TabItem>
+
+<TabItem value="aic">
+
+<Tabs
+groupId="k8s-api"
+defaultValue="gateway-api"
+values={[
+{label: 'Gateway API', value: 'gateway-api'},
+{label: 'APISIX CRD', value: 'apisix-crd'}
+]}>
+
+<TabItem value="gateway-api">
+
+```yaml title="ai-proxy-multi-ic.yaml"
+apiVersion: apisix.apache.org/v1alpha1
+kind: PluginConfig
+metadata:
+  namespace: aic
+  name: ai-proxy-multi-plugin-config
+spec:
+  plugins:
+    - name: ai-proxy-multi
+      config:
+        instances:
+          - name: openai-instance
+            provider: openai
+            weight: 8
+            auth:
+              header:
+                Authorization: "Bearer your-api-key"
+            options:
+              model: gpt-4
+          - name: deepseek-instance
+            provider: deepseek
+            weight: 2
+            auth:
+              header:
+                Authorization: "Bearer your-api-key"
+            options:
+              model: deepseek-chat
+---
+apiVersion: gateway.networking.k8s.io/v1
+kind: HTTPRoute
+metadata:
+  namespace: aic
+  name: ai-proxy-multi-route
+spec:
+  parentRefs:
+    - name: apisix
+  rules:
+    - matches:
+        - path:
+            type: Exact
+            value: /anything
+          method: POST
+      filters:
+        - type: ExtensionRef
+          extensionRef:
+            group: apisix.apache.org
+            kind: PluginConfig
+            name: ai-proxy-multi-plugin-config
+```
+
+</TabItem>
+
+<TabItem value="apisix-crd">
+
+```yaml title="ai-proxy-multi-ic.yaml"
+apiVersion: apisix.apache.org/v2
+kind: ApisixRoute
+metadata:
+  namespace: aic
+  name: ai-proxy-multi-route
+spec:
+  ingressClassName: apisix
+  http:
+    - name: ai-proxy-multi-route
+      match:
+        paths:
+          - /anything
+        methods:
+          - POST
+      plugins:
+        - name: ai-proxy-multi
+          enable: true
+          config:
+            instances:
+              - name: openai-instance
+                provider: openai
+                weight: 8
+                auth:
+                  header:
+                    Authorization: "Bearer your-api-key"
+                options:
+                  model: gpt-4
+              - name: deepseek-instance
+                provider: deepseek
+                weight: 2
+                auth:
+                  header:
+                    Authorization: "Bearer your-api-key"
+                options:
+                  model: deepseek-chat
+```
+
+</TabItem>
+
+</Tabs>
+
+Apply the configuration to your cluster:
+
+```shell
+kubectl apply -f ai-proxy-multi-ic.yaml
+```
+
+</TabItem>
+
+</Tabs>
+
 Send 10 POST requests to the Route with a system prompt and a sample user question in the request body, to see the number of requests forwarded to OpenAI and DeepSeek:
 
 ```shell
@@ -208,6 +380,17 @@ The following example demonstrates how you can configure two models with differe
 
 Create a Route as such and update with your LLM providers, models, API keys, and endpoints if applicable:
 
+<Tabs
+groupId="api"
+defaultValue="admin-api"
+values={[
+{label: 'Admin API', value: 'admin-api'},
+{label: 'ADC', value: 'adc'},
+{label: 'Ingress Controller', value: 'aic'}
+]}>
+
+<TabItem value="admin-api">
+
 ```shell
 curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
   -H "X-API-KEY: ${admin_key}" \
@@ -217,7 +400,7 @@ curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
     "methods": ["POST"],
     "plugins": {
       "ai-proxy-multi": {
-        "fallback_strategy: ["rate_limiting"],
+        "fallback_strategy": ["rate_limiting"],
         "instances": [
           {
             "name": "openai-instance",
@@ -263,6 +446,199 @@ curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
   }'
 ```
 
+</TabItem>
+
+<TabItem value="adc">
+
+```yaml title="adc.yaml"
+services:
+  - name: ai-proxy-multi-service
+    routes:
+      - name: ai-proxy-multi-route
+        uris:
+          - /anything
+        methods:
+          - POST
+        plugins:
+          ai-proxy-multi:
+            fallback_strategy:
+              - rate_limiting
+            instances:
+              - name: openai-instance
+                provider: openai
+                priority: 1
+                weight: 0
+                auth:
+                  header:
+                    Authorization: "Bearer ${OPENAI_API_KEY}"
+                options:
+                  model: gpt-4
+              - name: deepseek-instance
+                provider: deepseek
+                priority: 0
+                weight: 0
+                auth:
+                  header:
+                    Authorization: "Bearer ${DEEPSEEK_API_KEY}"
+                options:
+                  model: deepseek-chat
+          ai-rate-limiting:
+            instances:
+              - name: openai-instance
+                limit: 10
+                time_window: 60
+            limit_strategy: total_tokens
+```
+
+Synchronize the configuration to the gateway:
+
+```shell
+adc sync -f adc.yaml
+```
+
+</TabItem>
+
+<TabItem value="aic">
+
+<Tabs
+groupId="k8s-api"
+defaultValue="gateway-api"
+values={[
+{label: 'Gateway API', value: 'gateway-api'},
+{label: 'APISIX CRD', value: 'apisix-crd'}
+]}>
+
+<TabItem value="gateway-api">
+
+```yaml title="ai-proxy-multi-ic.yaml"
+apiVersion: apisix.apache.org/v1alpha1
+kind: PluginConfig
+metadata:
+  namespace: aic
+  name: ai-proxy-multi-plugin-config
+spec:
+  plugins:
+    - name: ai-proxy-multi
+      config:
+        fallback_strategy:
+          - rate_limiting
+        instances:
+          - name: openai-instance
+            provider: openai
+            priority: 1
+            weight: 0
+            auth:
+              header:
+                Authorization: "Bearer your-api-key"
+            options:
+              model: gpt-4
+          - name: deepseek-instance
+            provider: deepseek
+            priority: 0
+            weight: 0
+            auth:
+              header:
+                Authorization: "Bearer your-api-key"
+            options:
+              model: deepseek-chat
+    - name: ai-rate-limiting
+      config:
+        instances:
+          - name: openai-instance
+            limit: 10
+            time_window: 60
+        limit_strategy: total_tokens
+---
+apiVersion: gateway.networking.k8s.io/v1
+kind: HTTPRoute
+metadata:
+  namespace: aic
+  name: ai-proxy-multi-route
+spec:
+  parentRefs:
+    - name: apisix
+  rules:
+    - matches:
+        - path:
+            type: Exact
+            value: /anything
+          method: POST
+      filters:
+        - type: ExtensionRef
+          extensionRef:
+            group: apisix.apache.org
+            kind: PluginConfig
+            name: ai-proxy-multi-plugin-config
+```
+
+</TabItem>
+
+<TabItem value="apisix-crd">
+
+```yaml title="ai-proxy-multi-ic.yaml"
+apiVersion: apisix.apache.org/v2
+kind: ApisixRoute
+metadata:
+  namespace: aic
+  name: ai-proxy-multi-route
+spec:
+  ingressClassName: apisix
+  http:
+    - name: ai-proxy-multi-route
+      match:
+        paths:
+          - /anything
+        methods:
+          - POST
+      plugins:
+        - name: ai-proxy-multi
+          enable: true
+          config:
+            fallback_strategy:
+              - rate_limiting
+            instances:
+              - name: openai-instance
+                provider: openai
+                priority: 1
+                weight: 0
+                auth:
+                  header:
+                    Authorization: "Bearer your-api-key"
+                options:
+                  model: gpt-4
+              - name: deepseek-instance
+                provider: deepseek
+                priority: 0
+                weight: 0
+                auth:
+                  header:
+                    Authorization: "Bearer your-api-key"
+                options:
+                  model: deepseek-chat
+        - name: ai-rate-limiting
+          enable: true
+          config:
+            instances:
+              - name: openai-instance
+                limit: 10
+                time_window: 60
+            limit_strategy: total_tokens
+```
+
+</TabItem>
+
+</Tabs>
+
+Apply the configuration to your cluster:
+
+```shell
+kubectl apply -f ai-proxy-multi-ic.yaml
+```
+
+</TabItem>
+
+</Tabs>
+
 Send a POST request to the Route with a system prompt and a sample user question in the request body:
 
 ```shell
@@ -316,7 +692,7 @@ You should receive a response similar to the following:
 
 Since the `total_tokens` value exceeds the configured quota of `10`, the next request within the 60-second window is expected to be forwarded to the other instance.
 
-Within the same 60-second window, send another POST request to the route:
+Within the same 60-second window, send another POST request to the Route:
 
 ```shell
 curl "http://127.0.0.1:9080/anything" -X POST \
@@ -351,10 +727,21 @@ You should see a response similar to the following:
 
 ### Load Balance and Rate Limit by Consumers
 
-The following example demonstrates how you can configure two models for load balancing and apply rate limiting by consumer.
+The following example demonstrates how you can configure two models for load balancing and apply rate limiting by Consumer.
 
 Create a Consumer `johndoe` and a rate limiting quota of 10 tokens in a 60-second window on `openai-instance` instance:
 
+<Tabs
+groupId="api"
+defaultValue="admin-api"
+values={[
+{label: 'Admin API', value: 'admin-api'},
+{label: 'ADC', value: 'adc'},
+{label: 'Ingress Controller', value: 'aic'}
+]}>
+
+<TabItem value="admin-api">
+
 ```shell
 curl "http://127.0.0.1:9180/apisix/admin/consumers" -X PUT \
   -H "X-API-KEY: ${admin_key}" \
@@ -376,7 +763,7 @@ curl "http://127.0.0.1:9180/apisix/admin/consumers" -X PUT \
   }'
 ```
 
-Configure `key-auth` credential for `johndoe`:
+Configure `key-auth` Credential for `johndoe`:
 
 ```shell
 curl "http://127.0.0.1:9180/apisix/admin/consumers/johndoe/credentials" -X PUT \
@@ -397,7 +784,7 @@ Create another Consumer `janedoe` and a rate limiting quota of 10 tokens in a 60
 curl "http://127.0.0.1:9180/apisix/admin/consumers" -X PUT \
   -H "X-API-KEY: ${admin_key}" \
   -d '{
-    "username": "johndoe",
+    "username": "janedoe",
     "plugins": {
       "ai-rate-limiting": {
         "instances": [
@@ -414,7 +801,7 @@ curl "http://127.0.0.1:9180/apisix/admin/consumers" -X PUT \
   }'
 ```
 
-Configure `key-auth` credential for `janedoe`:
+Configure `key-auth` Credential for `janedoe`:
 
 ```shell
 curl "http://127.0.0.1:9180/apisix/admin/consumers/janedoe/credentials" -X PUT \
@@ -429,8 +816,183 @@ curl "http://127.0.0.1:9180/apisix/admin/consumers/janedoe/credentials" -X PUT \
   }'
 ```
 
+</TabItem>
+
+<TabItem value="adc">
+
+```yaml title="adc.yaml"
+consumers:
+  - username: johndoe
+    plugins:
+      ai-rate-limiting:
+        instances:
+          - name: openai-instance
+            limit: 10
+            time_window: 60
+        rejected_code: 429
+        limit_strategy: total_tokens
+    credentials:
+      - name: key-auth
+        type: key-auth
+        config:
+          key: john-key
+  - username: janedoe
+    plugins:
+      ai-rate-limiting:
+        instances:
+          - name: deepseek-instance
+            limit: 10
+            time_window: 60
+        rejected_code: 429
+        limit_strategy: total_tokens
+    credentials:
+      - name: key-auth
+        type: key-auth
+        config:
+          key: jane-key
+```
+
+Synchronize the configuration to the gateway:
+
+```shell
+adc sync -f adc.yaml
+```
+
+</TabItem>
+
+<TabItem value="aic">
+
+<Tabs
+groupId="k8s-api"
+defaultValue="gateway-api"
+values={[
+{label: 'Gateway API', value: 'gateway-api'},
+{label: 'APISIX CRD', value: 'apisix-crd'}
+]}>
+
+<TabItem value="gateway-api">
+
+```yaml title="ai-proxy-multi-consumer-ic.yaml"
+apiVersion: apisix.apache.org/v1alpha1
+kind: Consumer
+metadata:
+  namespace: aic
+  name: johndoe
+spec:
+  gatewayRef:
+    name: apisix
+  plugins:
+    - name: ai-rate-limiting
+      config:
+        instances:
+          - name: openai-instance
+            limit: 10
+            time_window: 60
+        rejected_code: 429
+        limit_strategy: total_tokens
+  credentials:
+    - type: key-auth
+      name: primary-key
+      config:
+        key: john-key
+---
+apiVersion: apisix.apache.org/v1alpha1
+kind: Consumer
+metadata:
+  namespace: aic
+  name: janedoe
+spec:
+  gatewayRef:
+    name: apisix
+  plugins:
+    - name: ai-rate-limiting
+      config:
+        instances:
+          - name: deepseek-instance
+            limit: 10
+            time_window: 60
+        rejected_code: 429
+        limit_strategy: total_tokens
+  credentials:
+    - type: key-auth
+      name: primary-key
+      config:
+        key: jane-key
+```
+
+</TabItem>
+
+<TabItem value="apisix-crd">
+
+```yaml title="ai-proxy-multi-consumer-ic.yaml"
+apiVersion: apisix.apache.org/v2
+kind: ApisixConsumer
+metadata:
+  namespace: aic
+  name: johndoe
+spec:
+  ingressClassName: apisix
+  authParameter:
+    keyAuth:
+      value:
+        key: john-key
+  plugins:
+    ai-rate-limiting:
+      instances:
+        - name: openai-instance
+          limit: 10
+          time_window: 60
+      rejected_code: 429
+      limit_strategy: total_tokens
+---
+apiVersion: apisix.apache.org/v2
+kind: ApisixConsumer
+metadata:
+  namespace: aic
+  name: janedoe
+spec:
+  ingressClassName: apisix
+  authParameter:
+    keyAuth:
+      value:
+        key: jane-key
+  plugins:
+    ai-rate-limiting:
+      instances:
+        - name: deepseek-instance
+          limit: 10
+          time_window: 60
+      rejected_code: 429
+      limit_strategy: total_tokens
+```
+
+</TabItem>
+
+</Tabs>
+
+Apply the configuration to your cluster:
+
+```shell
+kubectl apply -f ai-proxy-multi-consumer-ic.yaml
+```
+
+</TabItem>
+
+</Tabs>
+
 Create a Route as such and update with your LLM providers, models, API keys, and endpoints if applicable:
 
+<Tabs
+groupId="api"
+defaultValue="admin-api"
+values={[
+{label: 'Admin API', value: 'admin-api'},
+{label: 'ADC', value: 'adc'},
+{label: 'Ingress Controller', value: 'aic'}
+]}>
+
+<TabItem value="admin-api">
+
 ```shell
 curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
   -H "X-API-KEY: ${admin_key}" \
@@ -441,7 +1003,7 @@ curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
     "plugins": {
       "key-auth": {},
       "ai-proxy-multi": {
-        "fallback_strategy: ["rate_limiting"],
+        "fallback_strategy": ["rate_limiting"],
         "instances": [
           {
             "name": "openai-instance",
@@ -475,7 +1037,180 @@ curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
   }'
 ```
 
-Send a POST request to the Route without any consumer key:
+</TabItem>
+
+<TabItem value="adc">
+
+```yaml title="adc.yaml"
+services:
+  - name: ai-proxy-multi-service
+    routes:
+      - name: ai-proxy-multi-route
+        uris:
+          - /anything
+        methods:
+          - POST
+        plugins:
+          key-auth: {}
+          ai-proxy-multi:
+            fallback_strategy:
+              - rate_limiting
+            instances:
+              - name: openai-instance
+                provider: openai
+                weight: 0
+                auth:
+                  header:
+                    Authorization: "Bearer ${OPENAI_API_KEY}"
+                options:
+                  model: gpt-4
+              - name: deepseek-instance
+                provider: deepseek
+                weight: 0
+                auth:
+                  header:
+                    Authorization: "Bearer ${DEEPSEEK_API_KEY}"
+                options:
+                  model: deepseek-chat
+```
+
+Synchronize the configuration to the gateway:
+
+```shell
+adc sync -f adc.yaml
+```
+
+</TabItem>
+
+<TabItem value="aic">
+
+<Tabs
+groupId="k8s-api"
+defaultValue="gateway-api"
+values={[
+{label: 'Gateway API', value: 'gateway-api'},
+{label: 'APISIX CRD', value: 'apisix-crd'}
+]}>
+
+<TabItem value="gateway-api">
+
+```yaml title="ai-proxy-multi-ic.yaml"
+apiVersion: apisix.apache.org/v1alpha1
+kind: PluginConfig
+metadata:
+  namespace: aic
+  name: ai-proxy-multi-plugin-config
+spec:
+  plugins:
+    - name: key-auth
+      config:
+        _meta:
+          disable: false
+    - name: ai-proxy-multi
+      config:
+        fallback_strategy:
+          - rate_limiting
+        instances:
+          - name: openai-instance
+            provider: openai
+            weight: 0
+            auth:
+              header:
+                Authorization: "Bearer your-api-key"
+            options:
+              model: gpt-4
+          - name: deepseek-instance
+            provider: deepseek
+            weight: 0
+            auth:
+              header:
+                Authorization: "Bearer your-api-key"
+            options:
+              model: deepseek-chat
+---
+apiVersion: gateway.networking.k8s.io/v1
+kind: HTTPRoute
+metadata:
+  namespace: aic
+  name: ai-proxy-multi-route
+spec:
+  parentRefs:
+    - name: apisix
+  rules:
+    - matches:
+        - path:
+            type: Exact
+            value: /anything
+          method: POST
+      filters:
+        - type: ExtensionRef
+          extensionRef:
+            group: apisix.apache.org
+            kind: PluginConfig
+            name: ai-proxy-multi-plugin-config
+```
+
+</TabItem>
+
+<TabItem value="apisix-crd">
+
+```yaml title="ai-proxy-multi-ic.yaml"
+apiVersion: apisix.apache.org/v2
+kind: ApisixRoute
+metadata:
+  namespace: aic
+  name: ai-proxy-multi-route
+spec:
+  ingressClassName: apisix
+  http:
+    - name: ai-proxy-multi-route
+      match:
+        paths:
+          - /anything
+        methods:
+          - POST
+      plugins:
+        - name: key-auth
+          enable: true
+        - name: ai-proxy-multi
+          enable: true
+          config:
+            fallback_strategy:
+              - rate_limiting
+            instances:
+              - name: openai-instance
+                provider: openai
+                weight: 0
+                auth:
+                  header:
+                    Authorization: "Bearer your-api-key"
+                options:
+                  model: gpt-4
+              - name: deepseek-instance
+                provider: deepseek
+                weight: 0
+                auth:
+                  header:
+                    Authorization: "Bearer your-api-key"
+                options:
+                  model: deepseek-chat
+```
+
+</TabItem>
+
+</Tabs>
+
+Apply the configuration to your cluster:
+
+```shell
+kubectl apply -f ai-proxy-multi-ic.yaml
+```
+
+</TabItem>
+
+</Tabs>
+
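As a rough mental model of `fallback_strategy: ["rate_limiting"]` combined with per-instance token limits, the Python sketch below picks the first instance that still has token budget and returns a rejection once every budget is spent. It is an illustration only, not the Plugin's implementation: the `Instance` class, the `pick_instance` helper, the budget of 10 tokens per instance, and the 6-token request size are all hypothetical, and the sketch assumes tokens are counted after a request has been served.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    name: str
    limit: int      # hypothetical token budget for the current time window
    used: int = 0   # tokens consumed so far in the window

    def has_quota(self) -> bool:
        return self.used < self.limit


def pick_instance(instances):
    """Return the first instance with remaining quota, or None if all are exhausted."""
    for inst in instances:
        if inst.has_quota():
            return inst
    return None


instances = [
    Instance("openai-instance", limit=10),
    Instance("deepseek-instance", limit=10),
]

served = []
for tokens in [6, 6, 6, 6, 6]:       # five requests, 6 tokens each (assumed)
    inst = pick_instance(instances)
    if inst is None:
        served.append("429")         # mirrors rejected_code: 429
        continue
    inst.used += tokens              # usage is only known after the response
    served.append(inst.name)

print(served)
```

In this model the request that crosses the limit is still served, because usage is only recorded afterwards; later requests fall back to the next instance.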
+Send a POST request to the Route without any Consumer key:
 
 ```shell
 curl -i "http://127.0.0.1:9080/anything" -X POST \
@@ -661,7 +1396,7 @@ You should see a response similar to the following:
 }
 ```
 
-This shows `ai-proxy-multi` load balance the traffic with respect to the rate limiting rules in `ai-rate-limiting` by consumers.
+This shows that `ai-proxy-multi` load balances the traffic with respect to the rate limiting rules in `ai-rate-limiting` by Consumers.
 
 ### Restrict Maximum Number of Completion Tokens
 
@@ -671,6 +1406,17 @@ For demonstration and easier differentiation, you will be configuring one OpenAI
 
 Create a Route as such and update with your LLM providers, models, API keys, and endpoints if applicable:
 
+<Tabs
+groupId="api"
+defaultValue="admin-api"
+values={[
+{label: 'Admin API', value: 'admin-api'},
+{label: 'ADC', value: 'adc'},
+{label: 'Ingress Controller', value: 'aic'}
+]}>
+
+<TabItem value="admin-api">
+
 ```shell
 curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
   -H "X-API-KEY: ${admin_key}" \
@@ -715,6 +1461,172 @@ curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
   }'
 ```
 
+</TabItem>
+
+<TabItem value="adc">
+
+```yaml title="adc.yaml"
+services:
+  - name: ai-proxy-multi-service
+    routes:
+      - name: ai-proxy-multi-route
+        uris:
+          - /anything
+        methods:
+          - POST
+        plugins:
+          ai-proxy-multi:
+            instances:
+              - name: openai-instance
+                provider: openai
+                weight: 0
+                auth:
+                  header:
+                    Authorization: "Bearer ${OPENAI_API_KEY}"
+                options:
+                  model: gpt-4
+                  max_tokens: 50
+              - name: deepseek-instance
+                provider: deepseek
+                weight: 0
+                auth:
+                  header:
+                    Authorization: "Bearer ${DEEPSEEK_API_KEY}"
+                options:
+                  model: deepseek-chat
+                  max_tokens: 100
+```
+
+Synchronize the configuration to the gateway:
+
+```shell
+adc sync -f adc.yaml
+```
+
+</TabItem>
+
+<TabItem value="aic">
+
+<Tabs
+groupId="k8s-api"
+defaultValue="gateway-api"
+values={[
+{label: 'Gateway API', value: 'gateway-api'},
+{label: 'APISIX CRD', value: 'apisix-crd'}
+]}>
+
+<TabItem value="gateway-api">
+
+```yaml title="ai-proxy-multi-ic.yaml"
+apiVersion: apisix.apache.org/v1alpha1
+kind: PluginConfig
+metadata:
+  namespace: aic
+  name: ai-proxy-multi-plugin-config
+spec:
+  plugins:
+    - name: ai-proxy-multi
+      config:
+        instances:
+          - name: openai-instance
+            provider: openai
+            weight: 0
+            auth:
+              header:
+                Authorization: "Bearer your-api-key"
+            options:
+              model: gpt-4
+              max_tokens: 50
+          - name: deepseek-instance
+            provider: deepseek
+            weight: 0
+            auth:
+              header:
+                Authorization: "Bearer your-api-key"
+            options:
+              model: deepseek-chat
+              max_tokens: 100
+---
+apiVersion: gateway.networking.k8s.io/v1
+kind: HTTPRoute
+metadata:
+  namespace: aic
+  name: ai-proxy-multi-route
+spec:
+  parentRefs:
+    - name: apisix
+  rules:
+    - matches:
+        - path:
+            type: Exact
+            value: /anything
+          method: POST
+      filters:
+        - type: ExtensionRef
+          extensionRef:
+            group: apisix.apache.org
+            kind: PluginConfig
+            name: ai-proxy-multi-plugin-config
+```
+
+</TabItem>
+
+<TabItem value="apisix-crd">
+
+```yaml title="ai-proxy-multi-ic.yaml"
+apiVersion: apisix.apache.org/v2
+kind: ApisixRoute
+metadata:
+  namespace: aic
+  name: ai-proxy-multi-route
+spec:
+  ingressClassName: apisix
+  http:
+    - name: ai-proxy-multi-route
+      match:
+        paths:
+          - /anything
+        methods:
+          - POST
+      plugins:
+        - name: ai-proxy-multi
+          enable: true
+          config:
+            instances:
+              - name: openai-instance
+                provider: openai
+                weight: 0
+                auth:
+                  header:
+                    Authorization: "Bearer your-api-key"
+                options:
+                  model: gpt-4
+                  max_tokens: 50
+              - name: deepseek-instance
+                provider: deepseek
+                weight: 0
+                auth:
+                  header:
+                    Authorization: "Bearer your-api-key"
+                options:
+                  model: deepseek-chat
+                  max_tokens: 100
+```
+
+</TabItem>
+
+</Tabs>
+
+Apply the configuration to your cluster:
+
+```shell
+kubectl apply -f ai-proxy-multi-ic.yaml
+```
+
+</TabItem>
+
+</Tabs>
+
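The effect of `options.max_tokens` can be pictured as the gateway layering the instance's `options` over the client's request body before forwarding it upstream. The Python sketch below shows that merge under one labeled assumption: configured options take precedence over fields of the same name sent by the client. The `apply_instance_options` helper is hypothetical and is not part of APISIX.

```python
import json


def apply_instance_options(client_body: dict, options: dict) -> dict:
    """Return a new request body with the instance options layered on top."""
    merged = dict(client_body)   # leave the client's body untouched
    merged.update(options)       # assumption: configured options win on conflict
    return merged


client_body = {
    "messages": [
        {"role": "system", "content": "You are a mathematician"},
        {"role": "user", "content": "What is 1+1?"},
    ]
}

# Mirrors the openai-instance options in the Route above.
openai_options = {"model": "gpt-4", "max_tokens": 50}

body = apply_instance_options(client_body, openai_options)
print(json.dumps(body, indent=2))
```

With `max_tokens: 50` in the forwarded body, the provider caps the completion length, which is why responses from the two instances differ in length in this example.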
 Send a POST request to the Route with a system prompt and a sample user question in the request body:
 
 ```shell
@@ -803,6 +1715,17 @@ The following example demonstrates how you can configure the `ai-proxy-multi` Pl
 
 Create a Route as such and update with your LLM providers, embedding models, API keys, and endpoints:
 
+<Tabs
+groupId="api"
+defaultValue="admin-api"
+values={[
+{label: 'Admin API', value: 'admin-api'},
+{label: 'ADC', value: 'adc'},
+{label: 'Ingress Controller', value: 'aic'}
+]}>
+
+<TabItem value="admin-api">
+
 ```shell
 curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
   -H "X-API-KEY: ${admin_key}" \
@@ -851,6 +1774,178 @@ curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
   }'
 ```
 
+</TabItem>
+
+<TabItem value="adc">
+
+```yaml title="adc.yaml"
+services:
+  - name: ai-proxy-multi-service
+    routes:
+      - name: ai-proxy-multi-route
+        uris:
+          - /anything
+        methods:
+          - POST
+        plugins:
+          ai-proxy-multi:
+            instances:
+              - name: openai-instance
+                provider: openai
+                weight: 0
+                auth:
+                  header:
+                    Authorization: "Bearer ${OPENAI_API_KEY}"
+                options:
+                  model: text-embedding-3-small
+                override:
+                  endpoint: "https://api.openai.com/v1/embeddings"
+              - name: az-openai-instance
+                provider: azure-openai
+                weight: 0
+                auth:
+                  header:
+                    api-key: "${AZ_OPENAI_API_KEY}"
+                options:
+                  model: text-embedding-3-small
+                override:
+                  endpoint: "https://ai-plugin-developer.openai.azure.com/openai/deployments/text-embedding-3-small/embeddings?api-version=2023-05-15"
+```
+
+Synchronize the configuration to the gateway:
+
+```shell
+adc sync -f adc.yaml
+```
+
+</TabItem>
+
+<TabItem value="aic">
+
+<Tabs
+groupId="k8s-api"
+defaultValue="gateway-api"
+values={[
+{label: 'Gateway API', value: 'gateway-api'},
+{label: 'APISIX CRD', value: 'apisix-crd'}
+]}>
+
+<TabItem value="gateway-api">
+
+```yaml title="ai-proxy-multi-ic.yaml"
+apiVersion: apisix.apache.org/v1alpha1
+kind: PluginConfig
+metadata:
+  namespace: aic
+  name: ai-proxy-multi-plugin-config
+spec:
+  plugins:
+    - name: ai-proxy-multi
+      config:
+        instances:
+          - name: openai-instance
+            provider: openai
+            weight: 0
+            auth:
+              header:
+                Authorization: "Bearer your-api-key"
+            options:
+              model: text-embedding-3-small
+            override:
+              endpoint: "https://api.openai.com/v1/embeddings"
+          - name: az-openai-instance
+            provider: azure-openai
+            weight: 0
+            auth:
+              header:
+                api-key: "your-api-key"
+            options:
+              model: text-embedding-3-small
+            override:
+              endpoint: "https://ai-plugin-developer.openai.azure.com/openai/deployments/text-embedding-3-small/embeddings?api-version=2023-05-15"
+---
+apiVersion: gateway.networking.k8s.io/v1
+kind: HTTPRoute
+metadata:
+  namespace: aic
+  name: ai-proxy-multi-route
+spec:
+  parentRefs:
+    - name: apisix
+  rules:
+    - matches:
+        - path:
+            type: Exact
+            value: /anything
+          method: POST
+      filters:
+        - type: ExtensionRef
+          extensionRef:
+            group: apisix.apache.org
+            kind: PluginConfig
+            name: ai-proxy-multi-plugin-config
+```
+
+</TabItem>
+
+<TabItem value="apisix-crd">
+
+```yaml title="ai-proxy-multi-ic.yaml"
+apiVersion: apisix.apache.org/v2
+kind: ApisixRoute
+metadata:
+  namespace: aic
+  name: ai-proxy-multi-route
+spec:
+  ingressClassName: apisix
+  http:
+    - name: ai-proxy-multi-route
+      match:
+        paths:
+          - /anything
+        methods:
+          - POST
+      plugins:
+        - name: ai-proxy-multi
+          enable: true
+          config:
+            instances:
+              - name: openai-instance
+                provider: openai
+                weight: 0
+                auth:
+                  header:
+                    Authorization: "Bearer your-api-key"
+                options:
+                  model: text-embedding-3-small
+                override:
+                  endpoint: "https://api.openai.com/v1/embeddings"
+              - name: az-openai-instance
+                provider: azure-openai
+                weight: 0
+                auth:
+                  header:
+                    api-key: "your-api-key"
+                options:
+                  model: text-embedding-3-small
+                override:
+                  endpoint: "https://ai-plugin-developer.openai.azure.com/openai/deployments/text-embedding-3-small/embeddings?api-version=2023-05-15"
+```
+
+</TabItem>
+
+</Tabs>
+
+Apply the configuration to your cluster:
+
+```shell
+kubectl apply -f ai-proxy-multi-ic.yaml
+```
+
+</TabItem>
+
+</Tabs>
+
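The embedding Routes above depend on `override.endpoint` because, per the attribute description for `instances.override.endpoint`, the Plugin otherwise uses the default OpenAI chat completions endpoint. The Python sketch below shows that selection rule; `resolve_endpoint` is a hypothetical helper written for illustration, not a function exposed by APISIX.

```python
# Default endpoint named in the attribute description for instances.override.endpoint.
DEFAULT_ENDPOINT = "https://api.openai.com/v1/chat/completions"


def resolve_endpoint(instance: dict) -> str:
    """Return the override endpoint when configured, otherwise the default."""
    override = instance.get("override") or {}
    return override.get("endpoint") or DEFAULT_ENDPOINT


embedding_instance = {
    "name": "openai-instance",
    "provider": "openai",
    "options": {"model": "text-embedding-3-small"},
    "override": {"endpoint": "https://api.openai.com/v1/embeddings"},
}

chat_instance = {
    "name": "openai-instance",
    "provider": "openai",
    "options": {"model": "gpt-4"},
}

print(resolve_endpoint(embedding_instance))
print(resolve_endpoint(chat_instance))
```

Leaving out `override.endpoint` on an embedding instance would, under this rule, send embedding requests to the chat completions path.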
 Send a POST request to the Route with an input string:
 
 ```shell
@@ -895,6 +1990,17 @@ The following example demonstrates how you can configure the `ai-proxy-multi` Pl
 
 Create a Route as such and update the LLM providers, embedding models, API keys, and health check related configurations:
 
+<Tabs
+groupId="api"
+defaultValue="admin-api"
+values={[
+{label: 'Admin API', value: 'admin-api'},
+{label: 'ADC', value: 'adc'},
+{label: 'Ingress Controller', value: 'aic'}
+]}>
+
+<TabItem value="admin-api">
+
 ```shell
 curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
   -H "X-API-KEY: ${admin_key}" \
@@ -952,6 +2058,199 @@ curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
   }'
 ```
 
+</TabItem>
+
+<TabItem value="adc">
+
+```yaml title="adc.yaml"
+services:
+  - name: ai-proxy-multi-service
+    routes:
+      - name: ai-proxy-multi-route
+        uris:
+          - /anything
+        methods:
+          - POST
+        plugins:
+          ai-proxy-multi:
+            instances:
+              - name: llm-instance-1
+                provider: openai-compatible
+                weight: 0
+                auth:
+                  header:
+                    Authorization: "Bearer ${YOUR_LLM_API_KEY}"
+                options:
+                  model: "${YOUR_LLM_MODEL}"
+              - name: llm-instance-2
+                provider: openai-compatible
+                weight: 0
+                auth:
+                  header:
+                    Authorization: "Bearer ${YOUR_LLM_API_KEY}"
+                options:
+                  model: "${YOUR_LLM_MODEL}"
+                checks:
+                  active:
+                    type: https
+                    host: yourhost.com
+                    http_path: /your/probe/path
+                    healthy:
+                      interval: 2
+                      successes: 1
+                    unhealthy:
+                      interval: 1
+                      http_failures: 3
+```
+
+Synchronize the configuration to the gateway:
+
+```shell
+adc sync -f adc.yaml
+```
+
+</TabItem>
+
+<TabItem value="aic">
+
+<Tabs
+groupId="k8s-api"
+defaultValue="gateway-api"
+values={[
+{label: 'Gateway API', value: 'gateway-api'},
+{label: 'APISIX CRD', value: 'apisix-crd'}
+]}>
+
+<TabItem value="gateway-api">
+
+```yaml title="ai-proxy-multi-ic.yaml"
+apiVersion: apisix.apache.org/v1alpha1
+kind: PluginConfig
+metadata:
+  namespace: aic
+  name: ai-proxy-multi-plugin-config
+spec:
+  plugins:
+    - name: ai-proxy-multi
+      config:
+        instances:
+          - name: llm-instance-1
+            provider: openai-compatible
+            weight: 0
+            auth:
+              header:
+                Authorization: "Bearer your-api-key"
+            options:
+              model: your-model
+          - name: llm-instance-2
+            provider: openai-compatible
+            weight: 0
+            auth:
+              header:
+                Authorization: "Bearer your-api-key"
+            options:
+              model: your-model
+            checks:
+              active:
+                type: https
+                host: yourhost.com
+                http_path: /your/probe/path
+                healthy:
+                  interval: 2
+                  successes: 1
+                unhealthy:
+                  interval: 1
+                  http_failures: 3
+---
+apiVersion: gateway.networking.k8s.io/v1
+kind: HTTPRoute
+metadata:
+  namespace: aic
+  name: ai-proxy-multi-route
+spec:
+  parentRefs:
+    - name: apisix
+  rules:
+    - matches:
+        - path:
+            type: Exact
+            value: /anything
+          method: POST
+      filters:
+        - type: ExtensionRef
+          extensionRef:
+            group: apisix.apache.org
+            kind: PluginConfig
+            name: ai-proxy-multi-plugin-config
+```
+
+</TabItem>
+
+<TabItem value="apisix-crd">
+
+```yaml title="ai-proxy-multi-ic.yaml"
+apiVersion: apisix.apache.org/v2
+kind: ApisixRoute
+metadata:
+  namespace: aic
+  name: ai-proxy-multi-route
+spec:
+  ingressClassName: apisix
+  http:
+    - name: ai-proxy-multi-route
+      match:
+        paths:
+          - /anything
+        methods:
+          - POST
+      plugins:
+        - name: ai-proxy-multi
+          enable: true
+          config:
+            instances:
+              - name: llm-instance-1
+                provider: openai-compatible
+                weight: 0
+                auth:
+                  header:
+                    Authorization: "Bearer your-api-key"
+                options:
+                  model: your-model
+              - name: llm-instance-2
+                provider: openai-compatible
+                weight: 0
+                auth:
+                  header:
+                    Authorization: "Bearer your-api-key"
+                options:
+                  model: your-model
+                checks:
+                  active:
+                    type: https
+                    host: yourhost.com
+                    http_path: /your/probe/path
+                    healthy:
+                      interval: 2
+                      successes: 1
+                    unhealthy:
+                      interval: 1
+                      http_failures: 3
+```
+
+</TabItem>
+
+</Tabs>
+
+Apply the configuration to your cluster:
+
+```shell
+kubectl apply -f ai-proxy-multi-ic.yaml
+```
+
+</TabItem>
+
+</Tabs>
+
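The thresholds in the `checks.active` block (`healthy.successes: 1`, `unhealthy.http_failures: 3`) can be read as counters over consecutive probe results. The Python sketch below models that reading. It rests on assumptions that the configuration itself does not state: the instance starts out healthy, and each counter resets when the opposite result arrives. The `ActiveCheckState` class is hypothetical and is not the health checker used by APISIX.

```python
from dataclasses import dataclass


@dataclass
class ActiveCheckState:
    successes_required: int = 1   # checks.active.healthy.successes
    failures_required: int = 3    # checks.active.unhealthy.http_failures
    healthy: bool = True          # assumption: instances start out healthy
    _ok: int = 0                  # consecutive successful probes
    _fail: int = 0                # consecutive failed probes

    def record(self, probe_ok: bool) -> bool:
        """Record one probe result and return the resulting health status."""
        if probe_ok:
            self._ok += 1
            self._fail = 0
            if not self.healthy and self._ok >= self.successes_required:
                self.healthy = True
        else:
            self._fail += 1
            self._ok = 0
            if self.healthy and self._fail >= self.failures_required:
                self.healthy = False
        return self.healthy


state = ActiveCheckState()
# Three failed probes followed by one successful probe.
history = [state.record(ok) for ok in [False, False, False, True]]
print(history)
```

Under these assumptions the instance is marked unhealthy on the third consecutive failure and recovers after a single successful probe.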
 For verification, the behaviours should be consistent with the verification in [active health checks](../tutorials/health-check.md).
 
 ### Include LLM Information in Access Log
@@ -959,7 +2258,7 @@ For verification, the behaviours should be consistent with the verification in [
 The following example demonstrates how you can log LLM request related information in the gateway's access log to improve analytics and audit. The following variables are available:
 
 * `request_llm_model`: LLM model name specified in the request.
-* `apisix_upstream_response_time`: Time taken for APISIX to send the request to the upstream service and receive the full response
+* `apisix_upstream_response_time`: Time taken for APISIX to send the request to the upstream service and receive the full response.
 * `request_type`: Type of request, where the value could be `traditional_http`, `ai_chat`, or `ai_stream`.
 * `llm_time_to_first_token`: Duration from request sending to the first token received from the LLM service, in milliseconds.
 * `llm_model`: LLM model.
@@ -1017,3 +2316,308 @@ In the gateway's access log, you should see a log entry similar to the following
 ```
 
 The access log entry shows that the request type is `ai_chat`, the APISIX upstream response time is `5765` milliseconds, the time to first token is `2858` milliseconds, the requested LLM model is `gpt-4`, the LLM model is `gpt-4`, the prompt token usage is `23`, and the completion token usage is `8`.
+
+### Send Request Log to Logger
+
+The following example demonstrates how you can log request and response information, including the LLM model, token usage, and payloads, and push it to a logger. Before proceeding, you should first set up a logger, such as Kafka. See [`kafka-logger`](./kafka-logger.md) for more information.
+
+Create a Route to your LLM services and configure logging details as such:
+
+<Tabs
+groupId="api"
+defaultValue="admin-api"
+values={[
+{label: 'Admin API', value: 'admin-api'},
+{label: 'ADC', value: 'adc'},
+{label: 'Ingress Controller', value: 'aic'}
+]}>
+
+<TabItem value="admin-api">
+
+```shell
+curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
+  -H "X-API-KEY: ${admin_key}" \
+  -d '{
+    "id": "ai-proxy-multi-route",
+    "uri": "/anything",
+    "methods": ["POST"],
+    "plugins": {
+      "ai-proxy-multi": {
+        "instances": [
+          {
+            "name": "openai-instance",
+            "provider": "openai",
+            "weight": 8,
+            "auth": {
+              "header": {
+                "Authorization": "Bearer '"$OPENAI_API_KEY"'"
+              }
+            },
+            "options": {
+              "model": "gpt-4"
+            }
+          },
+          {
+            "name": "deepseek-instance",
+            "provider": "deepseek",
+            "weight": 2,
+            "auth": {
+              "header": {
+                "Authorization": "Bearer '"$DEEPSEEK_API_KEY"'"
+              }
+            },
+            "options": {
+              "model": "deepseek-chat"
+            }
+          }
+        ],
+        "logging": {
+          "summaries": true,
+          "payloads": true
+        }
+      },
+      "kafka-logger": {
+        "brokers": [
+          {
+            "host": "127.0.0.1",
+            "port": 9092
+          }
+        ],
+        "kafka_topic": "test2",
+        "key": "key1",
+        "batch_max_size": 1
+      }
+    }
+  }'
+```
+
+</TabItem>
+
+<TabItem value="adc">
+
+```yaml title="adc.yaml"
+services:
+  - name: ai-proxy-multi-service
+    routes:
+      - name: ai-proxy-multi-route
+        uris:
+          - /anything
+        methods:
+          - POST
+        plugins:
+          ai-proxy-multi:
+            instances:
+              - name: openai-instance
+                provider: openai
+                weight: 8
+                auth:
+                  header:
+                    Authorization: "Bearer ${OPENAI_API_KEY}"
+                options:
+                  model: gpt-4
+              - name: deepseek-instance
+                provider: deepseek
+                weight: 2
+                auth:
+                  header:
+                    Authorization: "Bearer ${DEEPSEEK_API_KEY}"
+                options:
+                  model: deepseek-chat
+            logging:
+              summaries: true
+              payloads: true
+          kafka-logger:
+            brokers:
+              - host: 127.0.0.1
+                port: 9092
+            kafka_topic: test2
+            key: key1
+            batch_max_size: 1
+```
+
+Synchronize the configuration to the gateway:
+
+```shell
+adc sync -f adc.yaml
+```
+
+</TabItem>
+
+<TabItem value="aic">
+
+<Tabs
+groupId="k8s-api"
+defaultValue="gateway-api"
+values={[
+{label: 'Gateway API', value: 'gateway-api'},
+{label: 'APISIX CRD', value: 'apisix-crd'}
+]}>
+
+<TabItem value="gateway-api">
+
+```yaml title="ai-proxy-multi-ic.yaml"
+apiVersion: apisix.apache.org/v1alpha1
+kind: PluginConfig
+metadata:
+  namespace: aic
+  name: ai-proxy-multi-plugin-config
+spec:
+  plugins:
+    - name: ai-proxy-multi
+      config:
+        instances:
+          - name: openai-instance
+            provider: openai
+            weight: 8
+            auth:
+              header:
+                Authorization: "Bearer your-api-key"
+            options:
+              model: gpt-4
+          - name: deepseek-instance
+            provider: deepseek
+            weight: 2
+            auth:
+              header:
+                Authorization: "Bearer your-api-key"
+            options:
+              model: deepseek-chat
+        logging:
+          summaries: true
+          payloads: true
+    - name: kafka-logger
+      config:
+        brokers:
+          - host: kafka.aic.svc.cluster.local
+            port: 9092
+        kafka_topic: test2
+        key: key1
+        batch_max_size: 1
+---
+apiVersion: gateway.networking.k8s.io/v1
+kind: HTTPRoute
+metadata:
+  namespace: aic
+  name: ai-proxy-multi-route
+spec:
+  parentRefs:
+    - name: apisix
+  rules:
+    - matches:
+        - path:
+            type: Exact
+            value: /anything
+          method: POST
+      filters:
+        - type: ExtensionRef
+          extensionRef:
+            group: apisix.apache.org
+            kind: PluginConfig
+            name: ai-proxy-multi-plugin-config
+```
+
+</TabItem>
+
+<TabItem value="apisix-crd">
+
+```yaml title="ai-proxy-multi-ic.yaml"
+apiVersion: apisix.apache.org/v2
+kind: ApisixRoute
+metadata:
+  namespace: aic
+  name: ai-proxy-multi-route
+spec:
+  ingressClassName: apisix
+  http:
+    - name: ai-proxy-multi-route
+      match:
+        paths:
+          - /anything
+        methods:
+          - POST
+      plugins:
+        - name: ai-proxy-multi
+          enable: true
+          config:
+            instances:
+              - name: openai-instance
+                provider: openai
+                weight: 8
+                auth:
+                  header:
+                    Authorization: "Bearer your-api-key"
+                options:
+                  model: gpt-4
+              - name: deepseek-instance
+                provider: deepseek
+                weight: 2
+                auth:
+                  header:
+                    Authorization: "Bearer your-api-key"
+                options:
+                  model: deepseek-chat
+            logging:
+              summaries: true
+              payloads: true
+        - name: kafka-logger
+          enable: true
+          config:
+            brokers:
+              - host: kafka.aic.svc.cluster.local
+                port: 9092
+            kafka_topic: test2
+            key: key1
+            batch_max_size: 1
+```
+
+</TabItem>
+
+</Tabs>
+
+Apply the configuration to your cluster:
+
+```shell
+kubectl apply -f ai-proxy-multi-ic.yaml
+```
+
+</TabItem>
+
+</Tabs>
+
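If the weights `8` and `2` in the Route above are read as relative traffic shares, roughly eight of every ten requests go to `openai-instance` and two to `deepseek-instance`. The Python sketch below illustrates this with smooth weighted round-robin, a generic scheduling technique chosen here for illustration; it is not taken from the APISIX source, and `smooth_weighted_round_robin` is a hypothetical helper.

```python
from collections import Counter


def smooth_weighted_round_robin(weights: dict, n: int) -> list:
    """Return n picks that spread choices in proportion to the given weights."""
    current = {name: 0 for name in weights}
    total = sum(weights.values())
    picks = []
    for _ in range(n):
        # Every candidate gains its weight, then the current leader is chosen
        # and pushed back by the total so the others can catch up.
        for name, weight in weights.items():
            current[name] += weight
        chosen = max(current, key=current.get)
        current[chosen] -= total
        picks.append(chosen)
    return picks


picks = smooth_weighted_round_robin(
    {"openai-instance": 8, "deepseek-instance": 2}, 10
)
counts = Counter(picks)
print(picks)
print(counts)
```

Over any window of ten consecutive picks this scheme selects each instance exactly as many times as its weight, while interleaving the lower-weight instance rather than sending its requests in a burst.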
+Send a POST request to the Route:
+
+```shell
+curl "http://127.0.0.1:9080/anything" -X POST \
+  -H "Content-Type: application/json" \
+  -d '{
+    "messages": [
+      { "role": "system", "content": "You are a mathematician" },
+      { "role": "user", "content": "What is 1+1?" }
+    ]
+  }'
+```
+
+You should receive a response similar to the following if the request is forwarded to OpenAI:
+
+```json
+{
+  ...,
+  "model": "gpt-4-0613",
+  "choices": [
+    {
+      "index": 0,
+      "message": {
+        "role": "assistant",
+        "content": "1+1 equals 2.",
+        "refusal": null
+      },
+      "logprobs": null,
+      "finish_reason": "stop"
+    }
+  ],
+  ...
+}
+```
+
+In the Kafka topic, you should also see a log entry corresponding to the request with the LLM summary and request/response payload.
diff --git a/docs/zh/latest/plugins/ai-proxy-multi.md b/docs/zh/latest/plugins/ai-proxy-multi.md
index 9cc22ed43..a764d7d11 100644
--- a/docs/zh/latest/plugins/ai-proxy-multi.md
+++ b/docs/zh/latest/plugins/ai-proxy-multi.md
@@ -33,6 +33,9 @@ description: The ai-proxy-multi Plugin, through load balancing, retries, failover, and
   <link rel="canonical" href="https://docs.api7.ai/hub/ai-proxy-multi" />
 </head>
 
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
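 ## Description
 
 The `ai-proxy-multi` Plugin simplifies access to LLM and embedding models by transforming Plugin configurations into the designated request format for OpenAI, DeepSeek, Azure, AIMLAPI, Anthropic, OpenRouter, Gemini, Vertex AI, and other OpenAI-compatible APIs. It extends the capabilities of [`ai-proxy`](./ai-proxy.md) with load balancing, retries, failover, and health checks.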
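@@ -68,7 +71,7 @@ description: The ai-proxy-multi Plugin, through load balancing, retries, failover, and
 | instances.auth.header               | object         | No    |                                   |              | Authentication headers. At least one of `header` and `query` should be configured. |
 | instances.auth.query                | object         | No    |                                   |              | Authentication query parameters. At least one of `header` and `query` should be configured. |
 | instances.auth.gcp                  | object         | No    |                                   |              | Google Cloud Platform (GCP) authentication configuration. |
-| instances.auth.gcp.service_account_json | string     | No    |                                   |              | Content of the GCP service account JSON file. It can also be configured by setting the “GCP_SERVICE_ACCOUNT” environment variable. |
+| instances.auth.gcp.service_account_json | string     | No    |                                   |              | Content of the GCP service account JSON file. It can also be configured by setting the "GCP_SERVICE_ACCOUNT" environment variable. |
 | instances.auth.gcp.max_ttl          | integer        | No    |                                   | minimum = 1  | Maximum TTL, in seconds, for caching the GCP access token. |
 | instances.auth.gcp.expire_early_secs| integer        | No    | 60                                | minimum = 0  | Number of seconds by which to expire the access token ahead of its actual expiry time, to avoid edge cases. |
 | instances.options                   | object         | No    |                                   |              | Model configuration. In addition to `model`, you can configure other parameters, which are forwarded to the upstream LLM service in the request body. For example, if you are using OpenAI, DeepSeek, or AIMLAPI, you can configure additional parameters such as `max_tokens`, `temperature`, `top_p`, and `stream`. See your LLM provider's API documentation for more available options. |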
@@ -78,26 +81,26 @@ description: The ai-proxy-multi Plugin, through load balancing, retries, failover, and
 | logging                             | object         | No    |                                   |              | Logging configuration. Does not affect `error.log`. |
 | logging.summaries                   | boolean        | No    | false                           |              | If true, log the request LLM model, duration, and request and response tokens. |
 | logging.payloads                    | boolean        | No    | false                           |              | If true, log the request and response payloads. |
-| checks                              | object         | No    |                                   |              | Health check configuration. Note that OpenAI, DeepSeek, and AIMLAPI currently do not provide official health check endpoints. Other LLM services that you configure under the `openai-compatible` provider may have health check endpoints available. |
-| checks.active                       | object         | Yes   |                                   |              | Active health check configuration. |
-| checks.active.type                  | string         | No    | http                            | [http, https, tcp] | Type of health check connection. |
-| checks.active.timeout               | number         | No    | 1                               |              | Health check timeout in seconds. |
-| checks.active.concurrency           | integer        | No    | 10                              |              | Number of upstream nodes to be checked at the same time. |
-| checks.active.host                  | string         | No    |                                   |              | HTTP host. |
-| checks.active.port                  | integer        | No    |                                   | 1 to 65535 (inclusive) | HTTP port. |
-| checks.active.http_path             | string         | No    | /                               |              | Path for HTTP probing requests. |
-| checks.active.https_verify_certificate | boolean   | No    | true                            |              | If true, verify the node's TLS certificate. |
-| checks.active.req_headers           | array[string]  | No    |                                   |              | Additional request headers for active health check probes. |
-| checks.active.healthy               | object         | No    |                                   |              | Healthy check configuration. |
-| checks.active.healthy.interval      | integer        | No    | 1                               | minimum = 1  | Interval, in seconds, for checking healthy nodes. |
-| checks.active.healthy.http_statuses | array[integer] | No    | [200, 302]                      | 200 to 599   | HTTP status codes that define a healthy node. |
-| checks.active.healthy.successes     | integer        | No    | 2                               | 1 to 254     | Number of successful probes required to define a healthy node. |
-| checks.active.unhealthy             | object         | No    |                                   |              | Unhealthy check configuration. |
-| checks.active.unhealthy.interval    | integer        | No    | 1                               | minimum = 1  | Interval, in seconds, for checking unhealthy nodes. |
-| checks.active.unhealthy.http_statuses | array[integer] | No  | [429, 404, 500, 501, 502, 503, 504, 505] | 200 to 599 | HTTP status codes that define an unhealthy node. |
-| checks.active.unhealthy.http_failures | integer      | No    | 5                               | 1 to 254     | Number of HTTP failures required to define an unhealthy node. |
-| checks.active.unhealthy.tcp_failures | integer       | No    | 2                               | 1 to 254     | Number of TCP failures required to define an unhealthy node. |
-| checks.active.unhealthy.timeouts    | integer        | No    | 3                               | 1 to 254     | Number of probe timeouts required to define an unhealthy node. |
+| instances.override | object | No | | | Override settings. |
+| instances.override.endpoint | string | No | | | LLM provider endpoint that replaces the default endpoint. If not configured, the Plugin uses the default OpenAI endpoint `https://api.openai.com/v1/chat/completions`. |
+| instances.checks | object | No | | | Health check configuration. Note that OpenAI, DeepSeek, and AIMLAPI currently do not provide official health check endpoints. Other LLM services that you configure under the `openai-compatible` provider may have health check endpoints available. |
+| instances.checks.active | object | Yes | | | Active health check configuration. |
+| instances.checks.active.type | string | No | http | [http, https, tcp] | Type of health check connection. |
+| instances.checks.active.timeout | number | No | 1 | | Health check timeout in seconds. |
+| instances.checks.active.concurrency | integer | No | 10 | | Number of upstream nodes to check concurrently. |
+| instances.checks.active.host | string | No | | | HTTP host. |
+| instances.checks.active.port | integer | No | | 1 to 65535 (inclusive) | HTTP port. |
+| instances.checks.active.http_path | string | No | / | | Path for HTTP probe requests. |
+| instances.checks.active.https_verify_certificate | boolean | No | true | | If true, verify the node's TLS certificate. |
+| instances.checks.active.healthy | object | No | | | Healthy check configuration. |
+| instances.checks.active.healthy.interval | integer | No | 1 | | Interval in seconds for checking healthy nodes. |
+| instances.checks.active.healthy.http_statuses | array[integer] | No | [200,302] | status codes between 200 and 599 (inclusive) | Array of HTTP status codes that define a healthy node. |
+| instances.checks.active.healthy.successes | integer | No | 2 | 1 to 254 (inclusive) | Number of successful probes required to define a healthy node. |
+| instances.checks.active.unhealthy | object | No | | | Unhealthy check configuration. |
+| instances.checks.active.unhealthy.interval | integer | No | 1 | | Interval in seconds for checking unhealthy nodes. |
+| instances.checks.active.unhealthy.http_statuses | array[integer] | No | [429,404,500,501,502,503,504,505] | status codes between 200 and 599 (inclusive) | Array of HTTP status codes that define an unhealthy node. |
+| instances.checks.active.unhealthy.http_failures | integer | No | 5 | 1 to 254 (inclusive) | Number of HTTP failures that define an unhealthy node. |
+| instances.checks.active.unhealthy.timeout | integer | No | 3 | 1 to 254 (inclusive) | Number of probe timeouts that define an unhealthy node. |
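 | timeout | integer | No | 30000 | greater than or equal to 1 | Request timeout in milliseconds when requesting the LLM service. |
 | keepalive | boolean | No | true | | If true, keep the connection alive when requesting the LLM service. |
 | keepalive_timeout | integer | No | 60000 | greater than or equal to 1000 | Request timeout in milliseconds when requesting the LLM service. |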
@@ -124,7 +127,18 @@ admin_key=$(yq '.deployment.admin.admin_key[0].key' conf/config.yaml | sed 's/"/
 
 For demonstration and easier differentiation, you will configure one OpenAI instance and one DeepSeek instance as the upstream LLM services.
 
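-Create a route and update it with your LLM providers, models, API keys, and endpoints (if applicable):
+Create a Route and update it with your LLM providers, models, API keys, and endpoints (if applicable):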
+
+<Tabs
+groupId="api"
+defaultValue="admin-api"
+values={[
+{label: 'Admin API', value: 'admin-api'},
+{label: 'ADC', value: 'adc'},
+{label: 'Ingress Controller', value: 'aic'}
+]}>
+
+<TabItem value="admin-api">
 
 ```shell
 curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
@@ -168,7 +182,167 @@ curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
   }'
 ```
 
-Send 10 POST requests to the route with a system prompt and a sample user question in the request body to see the number of requests forwarded to OpenAI and DeepSeek:
+</TabItem>
+
+<TabItem value="adc">
+
+```yaml title="adc.yaml"
+services:
+  - name: ai-proxy-multi-service
+    routes:
+      - name: ai-proxy-multi-route
+        uris:
+          - /anything
+        methods:
+          - POST
+        plugins:
+          ai-proxy-multi:
+            instances:
+              - name: openai-instance
+                provider: openai
+                weight: 8
+                auth:
+                  header:
+                    Authorization: "Bearer ${OPENAI_API_KEY}"
+                options:
+                  model: gpt-4
+              - name: deepseek-instance
+                provider: deepseek
+                weight: 2
+                auth:
+                  header:
+                    Authorization: "Bearer ${DEEPSEEK_API_KEY}"
+                options:
+                  model: deepseek-chat
+```
+
+Synchronize the configuration to the gateway:
+
+```shell
+adc sync -f adc.yaml
+```
+
+</TabItem>
+
+<TabItem value="aic">
+
+<Tabs
+groupId="k8s-api"
+defaultValue="gateway-api"
+values={[
+{label: 'Gateway API', value: 'gateway-api'},
+{label: 'APISIX CRD', value: 'apisix-crd'}
+]}>
+
+<TabItem value="gateway-api">
+
+```yaml title="ai-proxy-multi-ic.yaml"
+apiVersion: apisix.apache.org/v1alpha1
+kind: PluginConfig
+metadata:
+  namespace: aic
+  name: ai-proxy-multi-plugin-config
+spec:
+  plugins:
+    - name: ai-proxy-multi
+      config:
+        instances:
+          - name: openai-instance
+            provider: openai
+            weight: 8
+            auth:
+              header:
+                Authorization: "Bearer your-api-key"
+            options:
+              model: gpt-4
+          - name: deepseek-instance
+            provider: deepseek
+            weight: 2
+            auth:
+              header:
+                Authorization: "Bearer your-api-key"
+            options:
+              model: deepseek-chat
+---
+apiVersion: gateway.networking.k8s.io/v1
+kind: HTTPRoute
+metadata:
+  namespace: aic
+  name: ai-proxy-multi-route
+spec:
+  parentRefs:
+    - name: apisix
+  rules:
+    - matches:
+        - path:
+            type: Exact
+            value: /anything
+          method: POST
+      filters:
+        - type: ExtensionRef
+          extensionRef:
+            group: apisix.apache.org
+            kind: PluginConfig
+            name: ai-proxy-multi-plugin-config
+```
+
+</TabItem>
+
+<TabItem value="apisix-crd">
+
+```yaml title="ai-proxy-multi-ic.yaml"
+apiVersion: apisix.apache.org/v2
+kind: ApisixRoute
+metadata:
+  namespace: aic
+  name: ai-proxy-multi-route
+spec:
+  ingressClassName: apisix
+  http:
+    - name: ai-proxy-multi-route
+      match:
+        paths:
+          - /anything
+        methods:
+          - POST
+      plugins:
+        - name: ai-proxy-multi
+          enable: true
+          config:
+            instances:
+              - name: openai-instance
+                provider: openai
+                weight: 8
+                auth:
+                  header:
+                    Authorization: "Bearer your-api-key"
+                options:
+                  model: gpt-4
+              - name: deepseek-instance
+                provider: deepseek
+                weight: 2
+                auth:
+                  header:
+                    Authorization: "Bearer your-api-key"
+                options:
+                  model: deepseek-chat
+```
+
+</TabItem>
+
+</Tabs>
+
+Apply the configuration to the cluster:
+
+```shell
+kubectl apply -f ai-proxy-multi-ic.yaml
+```
+
+</TabItem>
+
+</Tabs>
+
+Send 10 POST requests to the Route with a system prompt and a sample user question in the request body to see the number of requests forwarded to OpenAI and DeepSeek:
 
 ```shell
 openai_count=0
@@ -206,7 +380,18 @@ DeepSeek responses: 2
 
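 The following example demonstrates how to configure two models with different priorities and apply rate limiting on the higher-priority instance. With `fallback_strategy` set to `["rate_limiting"]`, the Plugin should continue forwarding requests to the lower-priority instance once the rate limiting quota of the higher-priority instance is fully consumed.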
 
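-Create a route and update it with your LLM providers, models, API keys, and endpoints (if applicable):
+Create a Route and update it with your LLM providers, models, API keys, and endpoints (if applicable):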
+
+<Tabs
+groupId="api"
+defaultValue="admin-api"
+values={[
+{label: 'Admin API', value: 'admin-api'},
+{label: 'ADC', value: 'adc'},
+{label: 'Ingress Controller', value: 'aic'}
+]}>
+
+<TabItem value="admin-api">
 
 ```shell
 curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
@@ -263,7 +448,200 @@ curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
   }'
 ```
 
-Send a POST request to the route with a system prompt and a sample user question in the request body:
+</TabItem>
+
+<TabItem value="adc">
+
+```yaml title="adc.yaml"
+services:
+  - name: ai-proxy-multi-service
+    routes:
+      - name: ai-proxy-multi-route
+        uris:
+          - /anything
+        methods:
+          - POST
+        plugins:
+          ai-proxy-multi:
+            fallback_strategy:
+              - rate_limiting
+            instances:
+              - name: openai-instance
+                provider: openai
+                priority: 1
+                weight: 0
+                auth:
+                  header:
+                    Authorization: "Bearer ${OPENAI_API_KEY}"
+                options:
+                  model: gpt-4
+              - name: deepseek-instance
+                provider: deepseek
+                priority: 0
+                weight: 0
+                auth:
+                  header:
+                    Authorization: "Bearer ${DEEPSEEK_API_KEY}"
+                options:
+                  model: deepseek-chat
+          ai-rate-limiting:
+            instances:
+              - name: openai-instance
+                limit: 10
+                time_window: 60
+            limit_strategy: total_tokens
+```
+
+Synchronize the configuration to the gateway:
+
+```shell
+adc sync -f adc.yaml
+```
+
+</TabItem>
+
+<TabItem value="aic">
+
+<Tabs
+groupId="k8s-api"
+defaultValue="gateway-api"
+values={[
+{label: 'Gateway API', value: 'gateway-api'},
+{label: 'APISIX CRD', value: 'apisix-crd'}
+]}>
+
+<TabItem value="gateway-api">
+
+```yaml title="ai-proxy-multi-ic.yaml"
+apiVersion: apisix.apache.org/v1alpha1
+kind: PluginConfig
+metadata:
+  namespace: aic
+  name: ai-proxy-multi-plugin-config
+spec:
+  plugins:
+    - name: ai-proxy-multi
+      config:
+        fallback_strategy:
+          - rate_limiting
+        instances:
+          - name: openai-instance
+            provider: openai
+            priority: 1
+            weight: 0
+            auth:
+              header:
+                Authorization: "Bearer your-api-key"
+            options:
+              model: gpt-4
+          - name: deepseek-instance
+            provider: deepseek
+            priority: 0
+            weight: 0
+            auth:
+              header:
+                Authorization: "Bearer your-api-key"
+            options:
+              model: deepseek-chat
+    - name: ai-rate-limiting
+      config:
+        instances:
+          - name: openai-instance
+            limit: 10
+            time_window: 60
+        limit_strategy: total_tokens
+---
+apiVersion: gateway.networking.k8s.io/v1
+kind: HTTPRoute
+metadata:
+  namespace: aic
+  name: ai-proxy-multi-route
+spec:
+  parentRefs:
+    - name: apisix
+  rules:
+    - matches:
+        - path:
+            type: Exact
+            value: /anything
+          method: POST
+      filters:
+        - type: ExtensionRef
+          extensionRef:
+            group: apisix.apache.org
+            kind: PluginConfig
+            name: ai-proxy-multi-plugin-config
+```
+
+</TabItem>
+
+<TabItem value="apisix-crd">
+
+```yaml title="ai-proxy-multi-ic.yaml"
+apiVersion: apisix.apache.org/v2
+kind: ApisixRoute
+metadata:
+  namespace: aic
+  name: ai-proxy-multi-route
+spec:
+  ingressClassName: apisix
+  http:
+    - name: ai-proxy-multi-route
+      match:
+        paths:
+          - /anything
+        methods:
+          - POST
+      plugins:
+        - name: ai-proxy-multi
+          enable: true
+          config:
+            fallback_strategy:
+              - rate_limiting
+            instances:
+              - name: openai-instance
+                provider: openai
+                priority: 1
+                weight: 0
+                auth:
+                  header:
+                    Authorization: "Bearer your-api-key"
+                options:
+                  model: gpt-4
+              - name: deepseek-instance
+                provider: deepseek
+                priority: 0
+                weight: 0
+                auth:
+                  header:
+                    Authorization: "Bearer your-api-key"
+                options:
+                  model: deepseek-chat
+        - name: ai-rate-limiting
+          enable: true
+          config:
+            instances:
+              - name: openai-instance
+                limit: 10
+                time_window: 60
+            limit_strategy: total_tokens
+```
+
+</TabItem>
+
+</Tabs>
+
+Apply the configuration to the cluster:
+
+```shell
+kubectl apply -f ai-proxy-multi-ic.yaml
+```
+
+</TabItem>
+
+</Tabs>
+
+Send a POST request to the Route with a system prompt and a sample user question in the request body:
 
 ```shell
 curl "http://127.0.0.1:9080/anything" -X POST \
@@ -316,7 +694,7 @@ curl "http://127.0.0.1:9080/anything" -X POST \
 
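 Since the `total_tokens` value exceeds the configured quota of `10`, the next request within the 60-second window is expected to be forwarded to the other instance.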
 
-Within the same 60-second window, send another POST request to the route:
+Within the same 60-second window, send another POST request to the Route:
 
 ```shell
 curl "http://127.0.0.1:9080/anything" -X POST \
@@ -347,12 +725,24 @@ curl "http://127.0.0.1:9080/anything" -X POST \
   ],
   ...
 }
-```#
-## Load Balance and Rate Limit by Consumers
+```
+
+### Load Balance and Rate Limit by Consumers
 
 The following example demonstrates how to configure two models for load balancing and apply rate limiting by consumer.
 
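-Create a consumer `johndoe` and set a rate limiting quota of 10 tokens in a 60-second window on the `openai-instance` instance:
+Create a Consumer `johndoe` and set a rate limiting quota of 10 tokens in a 60-second window on the `openai-instance` instance: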
+
+<Tabs
+groupId="api"
+defaultValue="admin-api"
+values={[
+{label: 'Admin API', value: 'admin-api'},
+{label: 'ADC', value: 'adc'},
+{label: 'Ingress Controller', value: 'aic'}
+]}>
+
+<TabItem value="admin-api">
 
 ```shell
 curl "http://127.0.0.1:9180/apisix/admin/consumers" -X PUT \
@@ -375,7 +765,7 @@ curl "http://127.0.0.1:9180/apisix/admin/consumers" -X PUT \
   }'
 ```
 
-Configure the `key-auth` credential for `johndoe`:
+Configure the `key-auth` Credential for `johndoe`:
 
 ```shell
 curl "http://127.0.0.1:9180/apisix/admin/consumers/johndoe/credentials" -X PUT \
@@ -390,7 +780,7 @@ curl "http://127.0.0.1:9180/apisix/admin/consumers/johndoe/credentials" -X PUT \
   }'
 ```
 
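-Create another consumer `janedoe` and set a rate limiting quota of 10 tokens in a 60-second window on the `deepseek-instance` instance:
+Create another Consumer `janedoe` and set a rate limiting quota of 10 tokens in a 60-second window on the `deepseek-instance` instance: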
 
 ```shell
 curl "http://127.0.0.1:9180/apisix/admin/consumers" -X PUT \
@@ -413,7 +803,7 @@ curl "http://127.0.0.1:9180/apisix/admin/consumers" -X PUT \
   }'
 ```
 
-Configure the `key-auth` credential for `janedoe`:
+Configure the `key-auth` Credential for `janedoe`:
 
 ```shell
 curl "http://127.0.0.1:9180/apisix/admin/consumers/janedoe/credentials" -X PUT \
@@ -428,7 +818,182 @@ curl "http://127.0.0.1:9180/apisix/admin/consumers/janedoe/credentials" -X PUT \
   }'
 ```
 
-Create a route and update it with your LLM providers, models, API keys, and endpoints (if applicable):
+</TabItem>
+
+<TabItem value="adc">
+
+```yaml title="adc.yaml"
+consumers:
+  - username: johndoe
+    plugins:
+      ai-rate-limiting:
+        instances:
+          - name: openai-instance
+            limit: 10
+            time_window: 60
+        rejected_code: 429
+        limit_strategy: total_tokens
+    credentials:
+      - name: key-auth
+        type: key-auth
+        config:
+          key: john-key
+  - username: janedoe
+    plugins:
+      ai-rate-limiting:
+        instances:
+          - name: deepseek-instance
+            limit: 10
+            time_window: 60
+        rejected_code: 429
+        limit_strategy: total_tokens
+    credentials:
+      - name: key-auth
+        type: key-auth
+        config:
+          key: jane-key
+```
+
+Synchronize the configuration to the gateway:
+
+```shell
+adc sync -f adc.yaml
+```
+
+</TabItem>
+
+<TabItem value="aic">
+
+<Tabs
+groupId="k8s-api"
+defaultValue="gateway-api"
+values={[
+{label: 'Gateway API', value: 'gateway-api'},
+{label: 'APISIX CRD', value: 'apisix-crd'}
+]}>
+
+<TabItem value="gateway-api">
+
+```yaml title="ai-proxy-multi-consumer-ic.yaml"
+apiVersion: apisix.apache.org/v1alpha1
+kind: Consumer
+metadata:
+  namespace: aic
+  name: johndoe
+spec:
+  gatewayRef:
+    name: apisix
+  plugins:
+    - name: ai-rate-limiting
+      config:
+        instances:
+          - name: openai-instance
+            limit: 10
+            time_window: 60
+        rejected_code: 429
+        limit_strategy: total_tokens
+  credentials:
+    - type: key-auth
+      name: primary-key
+      config:
+        key: john-key
+---
+apiVersion: apisix.apache.org/v1alpha1
+kind: Consumer
+metadata:
+  namespace: aic
+  name: janedoe
+spec:
+  gatewayRef:
+    name: apisix
+  plugins:
+    - name: ai-rate-limiting
+      config:
+        instances:
+          - name: deepseek-instance
+            limit: 10
+            time_window: 60
+        rejected_code: 429
+        limit_strategy: total_tokens
+  credentials:
+    - type: key-auth
+      name: primary-key
+      config:
+        key: jane-key
+```
+
+</TabItem>
+
+<TabItem value="apisix-crd">
+
+```yaml title="ai-proxy-multi-consumer-ic.yaml"
+apiVersion: apisix.apache.org/v2
+kind: ApisixConsumer
+metadata:
+  namespace: aic
+  name: johndoe
+spec:
+  ingressClassName: apisix
+  authParameter:
+    keyAuth:
+      value:
+        key: john-key
+  plugins:
+    ai-rate-limiting:
+      instances:
+        - name: openai-instance
+          limit: 10
+          time_window: 60
+      rejected_code: 429
+      limit_strategy: total_tokens
+---
+apiVersion: apisix.apache.org/v2
+kind: ApisixConsumer
+metadata:
+  namespace: aic
+  name: janedoe
+spec:
+  ingressClassName: apisix
+  authParameter:
+    keyAuth:
+      value:
+        key: jane-key
+  plugins:
+    ai-rate-limiting:
+      instances:
+        - name: deepseek-instance
+          limit: 10
+          time_window: 60
+      rejected_code: 429
+      limit_strategy: total_tokens
+```
+
+</TabItem>
+
+</Tabs>
+
+Apply the configuration to the cluster:
+
+```shell
+kubectl apply -f ai-proxy-multi-consumer-ic.yaml
+```
+
+</TabItem>
+
+</Tabs>
+
+Create a Route and update it with your LLM providers, models, API keys, and endpoints (if applicable):
+
+<Tabs
+groupId="api"
+defaultValue="admin-api"
+values={[
+{label: 'Admin API', value: 'admin-api'},
+{label: 'ADC', value: 'adc'},
+{label: 'Ingress Controller', value: 'aic'}
+]}>
+
+<TabItem value="admin-api">
 
 ```shell
 curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
@@ -474,7 +1039,180 @@ curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
   }'
 ```
 
-Send a POST request to the route without any consumer key:
+</TabItem>
+
+<TabItem value="adc">
+
+```yaml title="adc.yaml"
+services:
+  - name: ai-proxy-multi-service
+    routes:
+      - name: ai-proxy-multi-route
+        uris:
+          - /anything
+        methods:
+          - POST
+        plugins:
+          key-auth: {}
+          ai-proxy-multi:
+            fallback_strategy:
+              - rate_limiting
+            instances:
+              - name: openai-instance
+                provider: openai
+                weight: 0
+                auth:
+                  header:
+                    Authorization: "Bearer ${OPENAI_API_KEY}"
+                options:
+                  model: gpt-4
+              - name: deepseek-instance
+                provider: deepseek
+                weight: 0
+                auth:
+                  header:
+                    Authorization: "Bearer ${DEEPSEEK_API_KEY}"
+                options:
+                  model: deepseek-chat
+```
+
+Synchronize the configuration to the gateway:
+
+```shell
+adc sync -f adc.yaml
+```
+
+</TabItem>
+
+<TabItem value="aic">
+
+<Tabs
+groupId="k8s-api"
+defaultValue="gateway-api"
+values={[
+{label: 'Gateway API', value: 'gateway-api'},
+{label: 'APISIX CRD', value: 'apisix-crd'}
+]}>
+
+<TabItem value="gateway-api">
+
+```yaml title="ai-proxy-multi-ic.yaml"
+apiVersion: apisix.apache.org/v1alpha1
+kind: PluginConfig
+metadata:
+  namespace: aic
+  name: ai-proxy-multi-plugin-config
+spec:
+  plugins:
+    - name: key-auth
+      config:
+        _meta:
+          disable: false
+    - name: ai-proxy-multi
+      config:
+        fallback_strategy:
+          - rate_limiting
+        instances:
+          - name: openai-instance
+            provider: openai
+            weight: 0
+            auth:
+              header:
+                Authorization: "Bearer your-api-key"
+            options:
+              model: gpt-4
+          - name: deepseek-instance
+            provider: deepseek
+            weight: 0
+            auth:
+              header:
+                Authorization: "Bearer your-api-key"
+            options:
+              model: deepseek-chat
+---
+apiVersion: gateway.networking.k8s.io/v1
+kind: HTTPRoute
+metadata:
+  namespace: aic
+  name: ai-proxy-multi-route
+spec:
+  parentRefs:
+    - name: apisix
+  rules:
+    - matches:
+        - path:
+            type: Exact
+            value: /anything
+          method: POST
+      filters:
+        - type: ExtensionRef
+          extensionRef:
+            group: apisix.apache.org
+            kind: PluginConfig
+            name: ai-proxy-multi-plugin-config
+```
+
+</TabItem>
+
+<TabItem value="apisix-crd">
+
+```yaml title="ai-proxy-multi-ic.yaml"
+apiVersion: apisix.apache.org/v2
+kind: ApisixRoute
+metadata:
+  namespace: aic
+  name: ai-proxy-multi-route
+spec:
+  ingressClassName: apisix
+  http:
+    - name: ai-proxy-multi-route
+      match:
+        paths:
+          - /anything
+        methods:
+          - POST
+      plugins:
+        - name: key-auth
+          enable: true
+        - name: ai-proxy-multi
+          enable: true
+          config:
+            fallback_strategy:
+              - rate_limiting
+            instances:
+              - name: openai-instance
+                provider: openai
+                weight: 0
+                auth:
+                  header:
+                    Authorization: "Bearer your-api-key"
+                options:
+                  model: gpt-4
+              - name: deepseek-instance
+                provider: deepseek
+                weight: 0
+                auth:
+                  header:
+                    Authorization: "Bearer your-api-key"
+                options:
+                  model: deepseek-chat
+```
+
+</TabItem>
+
+</Tabs>
+
+Apply the configuration to the cluster:
+
+```shell
+kubectl apply -f ai-proxy-multi-ic.yaml
+```
+
+</TabItem>
+
+</Tabs>
+
+Send a POST request to the Route without any consumer key:
 
 ```shell
 curl -i "http://127.0.0.1:9080/anything" -X POST \
@@ -489,7 +1227,7 @@ curl -i "http://127.0.0.1:9080/anything" -X POST \
 
 You should receive an `HTTP/1.1 401 Unauthorized` response.
 
-Send a POST request to the route with `johndoe`'s key:
+Send a POST request to the Route with `johndoe`'s key:
 
 ```shell
 curl "http://127.0.0.1:9080/anything" -X POST \
@@ -543,7 +1281,7 @@ curl "http://127.0.0.1:9080/anything" -X POST \
 
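 Since the `total_tokens` value exceeds the quota configured for `johndoe` on the `openai` instance, the next request from `johndoe` within the 60-second window is expected to be forwarded to the `deepseek` instance.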
 
-Within the same 60-second window, send another POST request to the route with `johndoe`'s key:
+Within the same 60-second window, send another POST request to the Route with `johndoe`'s key:
 
 ```shell
 curl "http://127.0.0.1:9080/anything" -X POST \
@@ -577,7 +1315,7 @@ curl "http://127.0.0.1:9080/anything" -X POST \
 }
 ```
 
-Send a POST request to the route with `janedoe`'s key:
+Send a POST request to the Route with `janedoe`'s key:
 
 ```shell
 curl "http://127.0.0.1:9080/anything" -X POST \
@@ -624,7 +1362,7 @@ curl "http://127.0.0.1:9080/anything" -X POST \
 
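 Since the `total_tokens` value exceeds the quota configured for `janedoe` on the `deepseek` instance, the next request from `janedoe` within the 60-second window is expected to be forwarded to the `openai` instance.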
 
-Within the same 60-second window, send another POST request to the route with `janedoe`'s key:
+Within the same 60-second window, send another POST request to the Route with `janedoe`'s key:
 
 ```shell
 curl "http://127.0.0.1:9080/anything" -X POST \
@@ -660,7 +1398,7 @@ curl "http://127.0.0.1:9080/anything" -X POST \
 }
 ```
 
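-This shows that `ai-proxy-multi` load balances traffic according to each consumer's rate limiting rules in `ai-rate-limiting`.
+This shows that `ai-proxy-multi` load balances traffic according to each Consumer's rate limiting rules in `ai-rate-limiting`.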
 
 ### Limit the Maximum Number of Completion Tokens
 
@@ -668,7 +1406,18 @@ curl "http://127.0.0.1:9080/anything" -X POST \
 
 For demonstration and easier differentiation, you will configure one OpenAI instance and one DeepSeek instance as the upstream LLM services.
 
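-Create a route and update it with your LLM providers, models, API keys, and endpoints (if applicable):
+Create a Route and update it with your LLM providers, models, API keys, and endpoints (if applicable):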
+
+<Tabs
+groupId="api"
+defaultValue="admin-api"
+values={[
+{label: 'Admin API', value: 'admin-api'},
+{label: 'ADC', value: 'adc'},
+{label: 'Ingress Controller', value: 'aic'}
+]}>
+
+<TabItem value="admin-api">
 
 ```shell
 curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
@@ -714,7 +1463,173 @@ curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
   }'
 ```
 
-Send a POST request to the route with a system prompt and a sample user question in the request body:
+</TabItem>
+
+<TabItem value="adc">
+
+```yaml title="adc.yaml"
+services:
+  - name: ai-proxy-multi-service
+    routes:
+      - name: ai-proxy-multi-route
+        uris:
+          - /anything
+        methods:
+          - POST
+        plugins:
+          ai-proxy-multi:
+            instances:
+              - name: openai-instance
+                provider: openai
+                weight: 0
+                auth:
+                  header:
+                    Authorization: "Bearer ${OPENAI_API_KEY}"
+                options:
+                  model: gpt-4
+                  max_tokens: 50
+              - name: deepseek-instance
+                provider: deepseek
+                weight: 0
+                auth:
+                  header:
+                    Authorization: "Bearer ${DEEPSEEK_API_KEY}"
+                options:
+                  model: deepseek-chat
+                  max_tokens: 100
+```
+
+Synchronize the configuration to the gateway:
+
+```shell
+adc sync -f adc.yaml
+```
+
+</TabItem>
+
+<TabItem value="aic">
+
+<Tabs
+groupId="k8s-api"
+defaultValue="gateway-api"
+values={[
+{label: 'Gateway API', value: 'gateway-api'},
+{label: 'APISIX CRD', value: 'apisix-crd'}
+]}>
+
+<TabItem value="gateway-api">
+
+```yaml title="ai-proxy-multi-ic.yaml"
+apiVersion: apisix.apache.org/v1alpha1
+kind: PluginConfig
+metadata:
+  namespace: aic
+  name: ai-proxy-multi-plugin-config
+spec:
+  plugins:
+    - name: ai-proxy-multi
+      config:
+        instances:
+          - name: openai-instance
+            provider: openai
+            weight: 0
+            auth:
+              header:
+                Authorization: "Bearer your-api-key"
+            options:
+              model: gpt-4
+              max_tokens: 50
+          - name: deepseek-instance
+            provider: deepseek
+            weight: 0
+            auth:
+              header:
+                Authorization: "Bearer your-api-key"
+            options:
+              model: deepseek-chat
+              max_tokens: 100
+---
+apiVersion: gateway.networking.k8s.io/v1
+kind: HTTPRoute
+metadata:
+  namespace: aic
+  name: ai-proxy-multi-route
+spec:
+  parentRefs:
+    - name: apisix
+  rules:
+    - matches:
+        - path:
+            type: Exact
+            value: /anything
+          method: POST
+      filters:
+        - type: ExtensionRef
+          extensionRef:
+            group: apisix.apache.org
+            kind: PluginConfig
+            name: ai-proxy-multi-plugin-config
+```
+
+</TabItem>
+
+<TabItem value="apisix-crd">
+
+```yaml title="ai-proxy-multi-ic.yaml"
+apiVersion: apisix.apache.org/v2
+kind: ApisixRoute
+metadata:
+  namespace: aic
+  name: ai-proxy-multi-route
+spec:
+  ingressClassName: apisix
+  http:
+    - name: ai-proxy-multi-route
+      match:
+        paths:
+          - /anything
+        methods:
+          - POST
+      plugins:
+        - name: ai-proxy-multi
+          enable: true
+          config:
+            instances:
+              - name: openai-instance
+                provider: openai
+                weight: 0
+                auth:
+                  header:
+                    Authorization: "Bearer your-api-key"
+                options:
+                  model: gpt-4
+                  max_tokens: 50
+              - name: deepseek-instance
+                provider: deepseek
+                weight: 0
+                auth:
+                  header:
+                    Authorization: "Bearer your-api-key"
+                options:
+                  model: deepseek-chat
+                  max_tokens: 100
+```
+
+</TabItem>
+
+</Tabs>
+
+Apply the configuration to the cluster:
+
+```shell
+kubectl apply -f ai-proxy-multi-ic.yaml
+```
+
+</TabItem>
+
+</Tabs>
+
+Send a POST request to the Route with a system prompt and a sample user question in the request body:
 
 ```shell
 curl "http://127.0.0.1:9080/anything" -X POST \
@@ -800,7 +1715,18 @@ curl "http://127.0.0.1:9080/anything" -X POST \
 
 The following example demonstrates how to configure the `ai-proxy-multi` Plugin to proxy requests and load balance between embedding models.
 
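-Create a route and update it with your LLM providers, embedding models, API keys, and endpoints:
+Create a Route and update it with your LLM providers, embedding models, API keys, and endpoints: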
+
+<Tabs
+groupId="api"
+defaultValue="admin-api"
+values={[
+{label: 'Admin API', value: 'admin-api'},
+{label: 'ADC', value: 'adc'},
+{label: 'Ingress Controller', value: 'aic'}
+]}>
+
+<TabItem value="admin-api">
 
 ```shell
 curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
@@ -850,7 +1776,179 @@ curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
   }'
 ```
 
-Send a POST request to the route with an input string:
+</TabItem>
+
+<TabItem value="adc">
+
+```yaml title="adc.yaml"
+services:
+  - name: ai-proxy-multi-service
+    routes:
+      - name: ai-proxy-multi-route
+        uris:
+          - /anything
+        methods:
+          - POST
+        plugins:
+          ai-proxy-multi:
+            instances:
+              - name: openai-instance
+                provider: openai
+                weight: 0
+                auth:
+                  header:
+                    Authorization: "Bearer ${OPENAI_API_KEY}"
+                options:
+                  model: text-embedding-3-small
+                override:
+                  endpoint: "https://api.openai.com/v1/embeddings"
+              - name: az-openai-instance
+                provider: openai-compatible
+                weight: 0
+                auth:
+                  header:
+                    Authorization: "Bearer ${AZ_OPENAI_API_KEY}"
+                options:
+                  model: text-embedding-3-small
+                override:
+                  endpoint: "https://ai-plugin-developer.openai.azure.com/openai/deployments/text-embedding-3-small/embeddings?api-version=2023-05-15"
+```
+
+Synchronize the configuration to the gateway:
+
+```shell
+adc sync -f adc.yaml
+```
+
+</TabItem>
+
+<TabItem value="aic">
+
+<Tabs
+groupId="k8s-api"
+defaultValue="gateway-api"
+values={[
+{label: 'Gateway API', value: 'gateway-api'},
+{label: 'APISIX CRD', value: 'apisix-crd'}
+]}>
+
+<TabItem value="gateway-api">
+
+```yaml title="ai-proxy-multi-ic.yaml"
+apiVersion: apisix.apache.org/v1alpha1
+kind: PluginConfig
+metadata:
+  namespace: aic
+  name: ai-proxy-multi-plugin-config
+spec:
+  plugins:
+    - name: ai-proxy-multi
+      config:
+        instances:
+          - name: openai-instance
+            provider: openai
+            weight: 0
+            auth:
+              header:
+                Authorization: "Bearer your-api-key"
+            options:
+              model: text-embedding-3-small
+            override:
+              endpoint: "https://api.openai.com/v1/embeddings"
+          - name: az-openai-instance
+            provider: openai-compatible
+            weight: 0
+            auth:
+              header:
+                Authorization: "Bearer your-api-key"
+            options:
+              model: text-embedding-3-small
+            override:
+              endpoint: "https://ai-plugin-developer.openai.azure.com/openai/deployments/text-embedding-3-small/embeddings?api-version=2023-05-15"
+---
+apiVersion: gateway.networking.k8s.io/v1
+kind: HTTPRoute
+metadata:
+  namespace: aic
+  name: ai-proxy-multi-route
+spec:
+  parentRefs:
+    - name: apisix
+  rules:
+    - matches:
+        - path:
+            type: Exact
+            value: /embeddings
+          method: POST
+      filters:
+        - type: ExtensionRef
+          extensionRef:
+            group: apisix.apache.org
+            kind: PluginConfig
+            name: ai-proxy-multi-plugin-config
+```
+
+</TabItem>
+
+<TabItem value="apisix-crd">
+
+```yaml title="ai-proxy-multi-ic.yaml"
+apiVersion: apisix.apache.org/v2
+kind: ApisixRoute
+metadata:
+  namespace: aic
+  name: ai-proxy-multi-route
+spec:
+  ingressClassName: apisix
+  http:
+    - name: ai-proxy-multi-route
+      match:
+        paths:
+          - /embeddings
+        methods:
+          - POST
+      plugins:
+        - name: ai-proxy-multi
+          enable: true
+          config:
+            instances:
+              - name: openai-instance
+                provider: openai
+                weight: 0
+                auth:
+                  header:
+                    Authorization: "Bearer your-api-key"
+                options:
+                  model: text-embedding-3-small
+                override:
+                  endpoint: "https://api.openai.com/v1/embeddings"
+              - name: az-openai-instance
+                provider: openai-compatible
+                weight: 0
+                auth:
+                  header:
+                    Authorization: "Bearer your-api-key"
+                options:
+                  model: text-embedding-3-small
+                override:
+                  endpoint: "https://ai-plugin-developer.openai.azure.com/openai/deployments/text-embedding-3-small/embeddings?api-version=2023-05-15"
+```
+
+</TabItem>
+
+</Tabs>
+
+Apply the configuration to the cluster:
+
+```shell
+kubectl apply -f ai-proxy-multi-ic.yaml
+```
+
+</TabItem>
+
+</Tabs>
+
+Send a POST request with an input string to the Route:
 
 ```shell
 curl "http://127.0.0.1:9080/embeddings" -X POST \
@@ -892,7 +1990,18 @@ curl "http://127.0.0.1:9080/embeddings" -X POST \
 
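 The following example demonstrates how to configure the `ai-proxy-multi` Plugin to proxy requests and load balance between models, with active health checks enabled to improve service availability. You can enable health checks on one or more instances.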
 
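-Create a route and update the LLM providers, embedding models, API keys, and health check configuration:
+Create a Route and update the LLM providers, embedding models, API keys, and health check configuration: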
+
+<Tabs
+groupId="api"
+defaultValue="admin-api"
+values={[
+{label: 'Admin API', value: 'admin-api'},
+{label: 'ADC', value: 'adc'},
+{label: 'Ingress Controller', value: 'aic'}
+]}>
+
+<TabItem value="admin-api">
 
 ```shell
 curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
@@ -951,8 +2060,506 @@ curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
   }'
 ```
 
+</TabItem>
+
+<TabItem value="adc">
+
+```yaml title="adc.yaml"
+services:
+  - name: ai-proxy-multi-service
+    routes:
+      - name: ai-proxy-multi-route
+        uris:
+          - /anything
+        methods:
+          - POST
+        plugins:
+          ai-proxy-multi:
+            instances:
+              - name: llm-instance-1
+                provider: openai-compatible
+                weight: 0
+                auth:
+                  header:
+                    Authorization: "Bearer ${YOUR_LLM_API_KEY}"
+                options:
+                  model: "${YOUR_LLM_MODEL}"
+              - name: llm-instance-2
+                provider: openai-compatible
+                weight: 0
+                auth:
+                  header:
+                    Authorization: "Bearer ${YOUR_LLM_API_KEY}"
+                options:
+                  model: "${YOUR_LLM_MODEL}"
+                checks:
+                  active:
+                    type: https
+                    host: yourhost.com
+                    http_path: /your/probe/path
+                    healthy:
+                      interval: 2
+                      successes: 1
+                    unhealthy:
+                      interval: 1
+                      http_failures: 3
+```
+
+Synchronize the configuration to the gateway:
+
+```shell
+adc sync -f adc.yaml
+```
+
+</TabItem>
+
+<TabItem value="aic">
+
+<Tabs
+groupId="k8s-api"
+defaultValue="gateway-api"
+values={[
+{label: 'Gateway API', value: 'gateway-api'},
+{label: 'APISIX CRD', value: 'apisix-crd'}
+]}>
+
+<TabItem value="gateway-api">
+
+```yaml title="ai-proxy-multi-ic.yaml"
+apiVersion: apisix.apache.org/v1alpha1
+kind: PluginConfig
+metadata:
+  namespace: aic
+  name: ai-proxy-multi-plugin-config
+spec:
+  plugins:
+    - name: ai-proxy-multi
+      config:
+        instances:
+          - name: llm-instance-1
+            provider: openai-compatible
+            weight: 0
+            auth:
+              header:
+                Authorization: "Bearer your-api-key"
+            options:
+              model: your-model
+          - name: llm-instance-2
+            provider: openai-compatible
+            weight: 0
+            auth:
+              header:
+                Authorization: "Bearer your-api-key"
+            options:
+              model: your-model
+            checks:
+              active:
+                type: https
+                host: yourhost.com
+                http_path: /your/probe/path
+                healthy:
+                  interval: 2
+                  successes: 1
+                unhealthy:
+                  interval: 1
+                  http_failures: 3
+---
+apiVersion: gateway.networking.k8s.io/v1
+kind: HTTPRoute
+metadata:
+  namespace: aic
+  name: ai-proxy-multi-route
+spec:
+  parentRefs:
+    - name: apisix
+  rules:
+    - matches:
+        - path:
+            type: Exact
+            value: /anything
+          method: POST
+      filters:
+        - type: ExtensionRef
+          extensionRef:
+            group: apisix.apache.org
+            kind: PluginConfig
+            name: ai-proxy-multi-plugin-config
+```
+
+</TabItem>
+
+<TabItem value="apisix-crd">
+
+```yaml title="ai-proxy-multi-ic.yaml"
+apiVersion: apisix.apache.org/v2
+kind: ApisixRoute
+metadata:
+  namespace: aic
+  name: ai-proxy-multi-route
+spec:
+  ingressClassName: apisix
+  http:
+    - name: ai-proxy-multi-route
+      match:
+        paths:
+          - /anything
+        methods:
+          - POST
+      plugins:
+        - name: ai-proxy-multi
+          enable: true
+          config:
+            instances:
+              - name: llm-instance-1
+                provider: openai-compatible
+                weight: 0
+                auth:
+                  header:
+                    Authorization: "Bearer your-api-key"
+                options:
+                  model: your-model
+              - name: llm-instance-2
+                provider: openai-compatible
+                weight: 0
+                auth:
+                  header:
+                    Authorization: "Bearer your-api-key"
+                options:
+                  model: your-model
+                checks:
+                  active:
+                    type: https
+                    host: yourhost.com
+                    http_path: /your/probe/path
+                    healthy:
+                      interval: 2
+                      successes: 1
+                    unhealthy:
+                      interval: 1
+                      http_failures: 3
+```
+
+</TabItem>
+
+</Tabs>
+
+Apply the configuration to the cluster:
+
+```shell
+kubectl apply -f ai-proxy-multi-ic.yaml
+```
+
+</TabItem>
+
+</Tabs>
+
 To verify, the behavior should match the verification described in [active health checks](../tutorials/health-check.md).
 
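Before relying on the active check, it can help to confirm that the probe target responds at all. The following is a minimal sketch, assuming the probe endpoint is expected to return a 2xx status. `yourhost.com` and `/your/probe/path` are the placeholders from the configuration above, and `probe_status` and `is_2xx` are hypothetical helpers. Which status codes the checker itself counts as healthy depends on plugin settings not shown here.

```shell
# Hypothetical helper: print the HTTP status code returned by the probe endpoint.
# $1 is the host, $2 is the path; curl prints 000 when the connection fails.
probe_status() {
  curl -s -o /dev/null -w '%{http_code}' "https://$1$2"
}

# Hypothetical helper: succeed when the given status code is in the 2xx range.
is_2xx() {
  case "$1" in
    2[0-9][0-9]) return 0 ;;
    *)           return 1 ;;
  esac
}
```

For example, `is_2xx "$(probe_status yourhost.com /your/probe/path)" && echo "probe target reachable"` prints a message only when the endpoint answers with a 2xx status.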
+### Send Request Logs to a Logger
+
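+The following example demonstrates how to log request and response information, including the LLM model, tokens, and payloads, and push it to a logger. Before proceeding, set up a logger such as Kafka. See [`kafka-logger`](./kafka-logger.md) for more information.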
+
+Create a Route to your LLM services and configure the logging details:
+
+<Tabs
+groupId="api"
+defaultValue="admin-api"
+values={[
+{label: 'Admin API', value: 'admin-api'},
+{label: 'ADC', value: 'adc'},
+{label: 'Ingress Controller', value: 'aic'}
+]}>
+
+<TabItem value="admin-api">
+
+```shell
+curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
+  -H "X-API-KEY: ${admin_key}" \
+  -d '{
+    "id": "ai-proxy-multi-route",
+    "uri": "/anything",
+    "methods": ["POST"],
+    "plugins": {
+      "ai-proxy-multi": {
+        "instances": [
+          {
+            "name": "openai-instance",
+            "provider": "openai",
+            "weight": 8,
+            "auth": {
+              "header": {
+                "Authorization": "Bearer '"$OPENAI_API_KEY"'"
+              }
+            },
+            "options": {
+              "model": "gpt-4"
+            }
+          },
+          {
+            "name": "deepseek-instance",
+            "provider": "deepseek",
+            "weight": 2,
+            "auth": {
+              "header": {
+                "Authorization": "Bearer '"$DEEPSEEK_API_KEY"'"
+              }
+            },
+            "options": {
+              "model": "deepseek-chat"
+            }
+          }
+        ],
+        "logging": {
+          "summaries": true,
+          "payloads": true
+        }
+      },
+      "kafka-logger": {
+        "brokers": [
+          {
+            "host": "127.0.0.1",
+            "port": 9092
+          }
+        ],
+        "kafka_topic": "test2",
+        "key": "key1",
+        "batch_max_size": 1
+      }
+    }
+  }'
+```
+
+</TabItem>
+
+<TabItem value="adc">
+
+```yaml title="adc.yaml"
+services:
+  - name: ai-proxy-multi-service
+    routes:
+      - name: ai-proxy-multi-route
+        uris:
+          - /anything
+        methods:
+          - POST
+        plugins:
+          ai-proxy-multi:
+            instances:
+              - name: openai-instance
+                provider: openai
+                weight: 8
+                auth:
+                  header:
+                    Authorization: "Bearer ${OPENAI_API_KEY}"
+                options:
+                  model: gpt-4
+              - name: deepseek-instance
+                provider: deepseek
+                weight: 2
+                auth:
+                  header:
+                    Authorization: "Bearer ${DEEPSEEK_API_KEY}"
+                options:
+                  model: deepseek-chat
+            logging:
+              summaries: true
+              payloads: true
+          kafka-logger:
+            brokers:
+              - host: 127.0.0.1
+                port: 9092
+            kafka_topic: test2
+            key: key1
+            batch_max_size: 1
+```
+
+Synchronize the configuration to the gateway:
+
+```shell
+adc sync -f adc.yaml
+```
+
+</TabItem>
+
+<TabItem value="aic">
+
+<Tabs
+groupId="k8s-api"
+defaultValue="gateway-api"
+values={[
+{label: 'Gateway API', value: 'gateway-api'},
+{label: 'APISIX CRD', value: 'apisix-crd'}
+]}>
+
+<TabItem value="gateway-api">
+
+```yaml title="ai-proxy-multi-ic.yaml"
+apiVersion: apisix.apache.org/v1alpha1
+kind: PluginConfig
+metadata:
+  namespace: aic
+  name: ai-proxy-multi-plugin-config
+spec:
+  plugins:
+    - name: ai-proxy-multi
+      config:
+        instances:
+          - name: openai-instance
+            provider: openai
+            weight: 8
+            auth:
+              header:
+                Authorization: "Bearer your-api-key"
+            options:
+              model: gpt-4
+          - name: deepseek-instance
+            provider: deepseek
+            weight: 2
+            auth:
+              header:
+                Authorization: "Bearer your-api-key"
+            options:
+              model: deepseek-chat
+        logging:
+          summaries: true
+          payloads: true
+    - name: kafka-logger
+      config:
+        brokers:
+          - host: kafka.aic.svc.cluster.local
+            port: 9092
+        kafka_topic: test2
+        key: key1
+        batch_max_size: 1
+---
+apiVersion: gateway.networking.k8s.io/v1
+kind: HTTPRoute
+metadata:
+  namespace: aic
+  name: ai-proxy-multi-route
+spec:
+  parentRefs:
+    - name: apisix
+  rules:
+    - matches:
+        - path:
+            type: Exact
+            value: /anything
+          method: POST
+      filters:
+        - type: ExtensionRef
+          extensionRef:
+            group: apisix.apache.org
+            kind: PluginConfig
+            name: ai-proxy-multi-plugin-config
+```
+
+</TabItem>
+
+<TabItem value="apisix-crd">
+
+```yaml title="ai-proxy-multi-ic.yaml"
+apiVersion: apisix.apache.org/v2
+kind: ApisixRoute
+metadata:
+  namespace: aic
+  name: ai-proxy-multi-route
+spec:
+  ingressClassName: apisix
+  http:
+    - name: ai-proxy-multi-route
+      match:
+        paths:
+          - /anything
+        methods:
+          - POST
+      plugins:
+        - name: ai-proxy-multi
+          enable: true
+          config:
+            instances:
+              - name: openai-instance
+                provider: openai
+                weight: 8
+                auth:
+                  header:
+                    Authorization: "Bearer your-api-key"
+                options:
+                  model: gpt-4
+              - name: deepseek-instance
+                provider: deepseek
+                weight: 2
+                auth:
+                  header:
+                    Authorization: "Bearer your-api-key"
+                options:
+                  model: deepseek-chat
+            logging:
+              summaries: true
+              payloads: true
+        - name: kafka-logger
+          enable: true
+          config:
+            brokers:
+              - host: kafka.aic.svc.cluster.local
+                port: 9092
+            kafka_topic: test2
+            key: key1
+            batch_max_size: 1
+```
+
+</TabItem>
+
+</Tabs>
+
+Apply the configuration to the cluster:
+
+```shell
+kubectl apply -f ai-proxy-multi-ic.yaml
+```
+
+</TabItem>
+
+</Tabs>
+
+Send a POST request to the Route:
+
+```shell
+curl "http://127.0.0.1:9080/anything" -X POST \
+  -H "Content-Type: application/json" \
+  -d '{
+    "messages": [
+      { "role": "system", "content": "You are a mathematician" },
+      { "role": "user", "content": "What is 1+1?" }
+    ]
+  }'
+```
+
+If the request is forwarded to OpenAI, you should receive a response similar to the following:
+
+```json
+{
+  ...,
+  "model": "gpt-4-0613",
+  "choices": [
+    {
+      "index": 0,
+      "message": {
+        "role": "assistant",
+        "content": "1+1 equals 2.",
+        "refusal": null
+      },
+      "logprobs": null,
+      "finish_reason": "stop"
+    }
+  ],
+  ...
+}
+```
+
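Because the two instances are weighted 8 and 2, roughly 80% of requests should reach OpenAI over a larger sample. The sketch below is one way to observe the split, assuming the `model` field of each response starts with `gpt-` for OpenAI and `deepseek-` for DeepSeek, as in the model names configured above. `send_requests` and `tally_models` are hypothetical helpers, and `jq` is assumed to be installed.

```shell
# Hypothetical helper: send $1 chat requests to the Route and print the model of each response.
send_requests() {
  n="$1"
  i=0
  while [ "$i" -lt "$n" ]; do
    curl -s "http://127.0.0.1:9080/anything" -X POST \
      -H "Content-Type: application/json" \
      -d '{"messages":[{"role":"user","content":"What is 1+1?"}]}' \
      | jq -r '.model'
    i=$((i + 1))
  done
}

# Hypothetical helper: read model names from stdin and count them per provider prefix.
tally_models() {
  openai=0; deepseek=0; other=0
  while IFS= read -r model; do
    case "$model" in
      gpt-*)      openai=$((openai + 1)) ;;
      deepseek-*) deepseek=$((deepseek + 1)) ;;
      *)          other=$((other + 1)) ;;
    esac
  done
  echo "openai=$openai deepseek=$deepseek other=$other"
}
```

Running `send_requests 10 | tally_models` prints one summary line. Failed or malformed responses are counted under `other`.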
+In the Kafka topic, you should also see a log entry for the request, containing the LLM summary and the request/response payloads.
+
 ### Include LLM Information in Access Log
 
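 The following example demonstrates how to log LLM request information in the gateway's access log to improve analytics and auditing. The following variables are available: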
@@ -975,7 +2582,7 @@ nginx_config:
 
 Reload APISIX for the configuration changes to take effect.
 
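-Next, create a route with the `ai-proxy-multi` Plugin and send a request. For example, if the request is forwarded to OpenAI and you receive the following response:
+Next, create a Route with the `ai-proxy-multi` Plugin and send a request. For example, if the request is forwarded to OpenAI and you receive the following response: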
 
 ```json
 {
