Copilot commented on code in PR #2028: URL: https://github.com/apache/apisix-website/pull/2028#discussion_r3076940162
########## blog/en/blog/2026/04/14/apisix-3.16-dynamic-rate-limiting.md: ########## @@ -0,0 +1,328 @@ +--- +title: "What's New in Apache APISIX 3.16: Dynamic Rate Limiting for Your API Gateway" +authors: + - name: "Ming Wen" + title: "Author" + url: "https://github.com/moonming" + image_url: "https://github.com/moonming.png" +keywords: + - Apache APISIX + - API Gateway + - Rate Limiting + - Dynamic Rate Limiting + - AI Gateway + - Multi-Tenant + - Token Budget +description: Apache APISIX 3.16 introduces dynamic rate limiting with multiple rules and variable support across limit-count, limit-conn, and ai-rate-limiting plugins, enabling context-aware, per-tier, and multi-tenant rate limiting in a single route configuration. +tags: [Products] +--- + +Rate limiting is one of the most critical capabilities in any API gateway. Yet for years, most gateways — including APISIX — have treated it as a static, one-size-fits-all configuration: set a number, set a time window, done. + +In practice, real-world rate limiting is far more nuanced. A SaaS platform needs different quotas for free and paid users. An AI gateway must enforce token budgets that vary by model and consumer. A multi-tenant API must isolate rate limits per tenant without duplicating routes. + +Apache APISIX 3.16 addresses these challenges head-on with two powerful enhancements to the rate limiting plugins: **multiple rules** and **variable support**. Together, they transform rate limiting from static configuration into a dynamic, context-aware policy engine. 
+ +<!--truncate--> + +## What Changed in APISIX 3.16 + +APISIX 3.16 introduces two complementary features across the `limit-count`, `limit-conn`, and `ai-rate-limiting` plugins: + +| Feature | Description | Supported Plugins | +|---------|-------------|-------------------| +| Multiple rules | Define an array of rate limiting rules with independent thresholds and time windows | `limit-count`, `limit-conn`, `ai-rate-limiting` | +| Variable support | Use APISIX variables (`${remote_addr}`, `${http_*}`, `${consumer_name}`, etc.) in `count`, `time_window`, and `key` fields, with optional default values via `${var ?? default}` | `limit-count`, `limit-conn`, `ai-rate-limiting` | + +Both features are fully backward compatible. Existing configurations continue to work without modification. + +## Multiple Rules: Beyond Single-Threshold Rate Limiting + +### The Problem + +Consider a common requirement: limit an API to **10 requests per second** and **500 requests per minute**. Before 3.16, you had to configure two separate plugin instances or chain multiple routes. This was verbose, error-prone, and hard to maintain. + +### The Solution + +The new `rules` array lets you define multiple rate limiting policies in a single plugin configuration. Each rule operates independently with its own counter, time window, and key. + +```json +{ + "uri": "/api/v1/*", + "plugins": { + "limit-count": { + "rules": [ + { + "count": 10, + "time_window": 1, + "key": "${remote_addr}_per_second", + "header_prefix": "per-second" + }, + { + "count": 500, + "time_window": 60, + "key": "${remote_addr}_per_minute", + "header_prefix": "per-minute" + }, + { + "count": 10000, + "time_window": 86400, + "key": "${remote_addr}_per_day", + "header_prefix": "per-day" + } + ], + "rejected_code": 429 + } + }, + "upstream": { + "type": "roundrobin", + "nodes": { + "127.0.0.1:1980": 1 + } + } +} +``` + +With this configuration, APISIX enforces all three limits simultaneously. 
A client hitting the per-second limit receives a `429` response with headers indicating which limit was exceeded: + +``` +X-Per-Second-RateLimit-Limit: 10 +X-Per-Second-RateLimit-Remaining: 0 +X-Per-Second-RateLimit-Reset: 1 +X-Per-Minute-RateLimit-Limit: 500 +X-Per-Minute-RateLimit-Remaining: 499 +X-Per-Minute-RateLimit-Reset: 60 +``` + +The `header_prefix` field lets clients distinguish which rule triggered the rejection — critical for debugging and client-side retry logic. + +## Variable Support: Context-Aware Rate Limiting + +### The Problem + +Static rate limits assume every consumer is equal. In reality, a free-tier user and an enterprise customer should have very different quotas. Before 3.16, supporting this meant creating separate routes for each tier — leading to route explosion and configuration drift. + +### The Solution + +Variable support lets you pull rate limiting parameters directly from the request context. The `count`, `time_window`, and `key` fields now accept APISIX variables. + +### Example 1: Per-Tier Rate Limiting via HTTP Header + +Suppose your authentication middleware injects an `X-Rate-Quota` header based on the user's subscription tier: + +```json +{ + "uri": "/api/v1/*", + "plugins": { + "limit-count": { + "rules": [ + { + "count": "${http_x_rate_quota ?? 100}", + "time_window": 60, + "key": "${consumer_name}" + } Review Comment: The default-value syntax `${http_x_rate_quota ?? 100}` is introduced here but isn’t explained, and it doesn’t match the variable form used in the APISIX 3.16.0 release post (`$http_x_custom_header`). To avoid confusion, consider using the officially documented default-value mechanism for resolved variables (or explicitly explaining what `${... ?? ...}` expands to and linking to the upstream PR/doc) so readers don’t copy an expression that APISIX won’t parse. 
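To make the question concrete, here is a minimal Python sketch of the semantics the post appears to intend for `${var ?? default}`: substitute the variable from the request context, and fall back to the literal after `??` when the variable is missing or empty. This is an illustration under assumptions only. The `resolve` helper and the `ctx` dictionary are hypothetical names, the code is not APISIX's parser, and it says nothing about whether APISIX 3.16 actually accepts this syntax.

```python
import re

# Hypothetical illustration of the semantics the post implies for
# "${name}" and "${name ?? default}". This is NOT APISIX code.
_TEMPLATE = re.compile(r"\$\{\s*(\w+)\s*(?:\?\?\s*([^}]*?)\s*)?\}")


def resolve(template, ctx):
    """Expand ${name} and ${name ?? default} using values from ctx."""

    def substitute(match):
        name, default = match.group(1), match.group(2)
        value = ctx.get(name)
        if value is not None and value != "":
            return str(value)
        # Variable missing or empty: use the default if given, else "".
        return default if default is not None else ""

    return _TEMPLATE.sub(substitute, template)


if __name__ == "__main__":
    # Header absent: the assumed fallback applies.
    print(resolve("${http_x_rate_quota ?? 100}", {}))
    # Header present: its value wins over the fallback.
    print(resolve("${http_x_rate_quota ?? 100}", {"http_x_rate_quota": "1000"}))
```

If the upstream behaviour differs from this reading (for example, if a missing variable causes the rule to be skipped instead), the post should say so explicitly next to the example.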
########## blog/en/blog/2026/04/14/apisix-3.16-dynamic-rate-limiting.md: ########## @@ -0,0 +1,328 @@
+ +<!--truncate--> + +## What Changed in APISIX 3.16 + +APISIX 3.16 introduces two complementary features across the `limit-count`, `limit-conn`, and `ai-rate-limiting` plugins: + +| Feature | Description | Supported Plugins | +|---------|-------------|-------------------| +| Multiple rules | Define an array of rate limiting rules with independent thresholds and time windows | `limit-count`, `limit-conn`, `ai-rate-limiting` | +| Variable support | Use APISIX variables (`${remote_addr}`, `${http_*}`, `${consumer_name}`, etc.) in `count`, `time_window`, and `key` fields, with optional default values via `${var ?? default}` | `limit-count`, `limit-conn`, `ai-rate-limiting` | Review Comment: The variable syntax and field list in the “Variable support” row look inconsistent with other APISIX 3.16 content in this repo and with the `limit-conn` schema used later in this post. For example, the 3.16.0 release post describes variables as `$http_x_custom_header` (not `${...}`), and `limit-conn` uses `conn`/`burst` rather than `count`/`time_window`. Suggest updating this row (and related prose) to use the same variable notation as the release post and to describe support in terms of “key and rate fields” (or plugin-specific fields), so that readers don’t copy a config that won’t work. ```suggestion | Variable support | Use APISIX variables (`$remote_addr`, `$http_*`, `$consumer_name`, etc.) 
in `key` and plugin-specific rate or threshold fields | `limit-count`, `limit-conn`, `ai-rate-limiting` | ``` ########## blog/en/blog/2026/04/14/apisix-3.16-dynamic-rate-limiting.md: ########## @@ -0,0 +1,328 @@
+ +<!--truncate--> + +## What Changed in APISIX 3.16 + +APISIX 3.16 introduces two complementary features across the `limit-count`, `limit-conn`, and `ai-rate-limiting` plugins: + +| Feature | Description | Supported Plugins | +|---------|-------------|-------------------| +| Multiple rules | Define an array of rate limiting rules with independent thresholds and time windows | `limit-count`, `limit-conn`, `ai-rate-limiting` | +| Variable support | Use APISIX variables (`${remote_addr}`, `${http_*}`, `${consumer_name}`, etc.) in `count`, `time_window`, and `key` fields, with optional default values via `${var ?? default}` | `limit-count`, `limit-conn`, `ai-rate-limiting` | + +Both features are fully backward compatible. Existing configurations continue to work without modification. + +## Multiple Rules: Beyond Single-Threshold Rate Limiting + +### The Problem + +Consider a common requirement: limit an API to **10 requests per second** and **500 requests per minute**. Before 3.16, you had to configure two separate plugin instances or chain multiple routes. This was verbose, error-prone, and hard to maintain. + +### The Solution + +The new `rules` array lets you define multiple rate limiting policies in a single plugin configuration. Each rule operates independently with its own counter, time window, and key. + +```json +{ + "uri": "/api/v1/*", + "plugins": { + "limit-count": { + "rules": [ + { + "count": 10, + "time_window": 1, + "key": "${remote_addr}_per_second", + "header_prefix": "per-second" + }, + { + "count": 500, + "time_window": 60, + "key": "${remote_addr}_per_minute", + "header_prefix": "per-minute" + }, + { + "count": 10000, + "time_window": 86400, + "key": "${remote_addr}_per_day", + "header_prefix": "per-day" + } + ], + "rejected_code": 429 + } + }, + "upstream": { + "type": "roundrobin", + "nodes": { + "127.0.0.1:1980": 1 + } + } +} +``` + +With this configuration, APISIX enforces all three limits simultaneously. 
A client hitting the per-second limit receives a `429` response with headers indicating which limit was exceeded: + +``` +X-Per-Second-RateLimit-Limit: 10 +X-Per-Second-RateLimit-Remaining: 0 +X-Per-Second-RateLimit-Reset: 1 +X-Per-Minute-RateLimit-Limit: 500 +X-Per-Minute-RateLimit-Remaining: 499 +X-Per-Minute-RateLimit-Reset: 60 +``` + +The `header_prefix` field lets clients distinguish which rule triggered the rejection — critical for debugging and client-side retry logic. + +## Variable Support: Context-Aware Rate Limiting + +### The Problem + +Static rate limits assume every consumer is equal. In reality, a free-tier user and an enterprise customer should have very different quotas. Before 3.16, supporting this meant creating separate routes for each tier — leading to route explosion and configuration drift. + +### The Solution + +Variable support lets you pull rate limiting parameters directly from the request context. The `count`, `time_window`, and `key` fields now accept APISIX variables. + +### Example 1: Per-Tier Rate Limiting via HTTP Header + +Suppose your authentication middleware injects an `X-Rate-Quota` header based on the user's subscription tier: + +```json +{ + "uri": "/api/v1/*", + "plugins": { + "limit-count": { + "rules": [ + { + "count": "${http_x_rate_quota ?? 100}", + "time_window": 60, + "key": "${consumer_name}" + } + ], + "rejected_code": 429 + } + }, + "upstream": { + "type": "roundrobin", + "nodes": { + "127.0.0.1:1980": 1 + } + } +} +``` + +Now the same route handles all tiers: + +| Tier | `X-Rate-Quota` Header | Effective Limit | +|------|----------------------|-----------------| +| Free | 100 | 100 req/min | +| Pro | 1000 | 1,000 req/min | +| Enterprise | 50000 | 50,000 req/min | + +One route. One plugin configuration. All tiers. 
+ +### Example 2: Multi-Tenant Isolation with Variable Combination + +For a multi-tenant SaaS API, you can combine variables to create isolated rate limit buckets per tenant per endpoint: + +```json +{ + "uri": "/api/v1/*", + "plugins": { + "limit-count": { + "rules": [ + { + "count": 1000, + "time_window": 60, + "key": "${http_x_tenant_id} ${uri}" + } + ], + "rejected_code": 429 + } + }, + "upstream": { + "type": "roundrobin", + "nodes": { + "127.0.0.1:1980": 1 + } + } +} +``` + +Tenant A calling `/api/v1/users` and Tenant B calling the same endpoint get independent counters. Tenant A calling `/api/v1/orders` gets yet another counter. This creates a natural per-tenant-per-endpoint isolation without any route duplication. + +### Example 3: Dynamic Concurrent Connection Limits + +The `limit-conn` plugin also supports rules and variables, enabling dynamic concurrency control: + +```json +{ + "uri": "/api/v1/inference", + "plugins": { + "limit-conn": { + "default_conn_delay": 0.1, + "rules": [ + { + "conn": 5, + "burst": 2, + "key": "${consumer_name}" + }, + { + "conn": 100, + "burst": 20, + "key": "global" + } + ], + "rejected_code": 503 + } + }, + "upstream": { + "type": "roundrobin", + "nodes": { + "127.0.0.1:1980": 1 + } + } +} +``` + +This limits each consumer to 5 concurrent connections while capping the total at 100 — preventing any single consumer from monopolizing backend capacity. 
+ +## AI Rate Limiting: Token Budget Management + +For AI gateway use cases, the `ai-rate-limiting` plugin combines multiple rules with variable support for fine-grained token budget control: + +```json +{ + "uri": "/v1/chat/completions", + "plugins": { + "ai-rate-limiting": { + "limit_strategy": "total_tokens", + "rules": [ + { + "count": 10000, + "time_window": 60, + "key": "${consumer_name}_per_minute", + "header_prefix": "consumer" + }, + { + "count": 500000, + "time_window": 86400, + "key": "${consumer_name}_per_day", + "header_prefix": "daily" + }, + { + "count": 1000000, + "time_window": 60, + "key": "global", + "header_prefix": "global" + } + ], + "rejected_code": 429 + } + }, + "upstream": { + "type": "roundrobin", + "nodes": { + "127.0.0.1:1980": 1 + } + } +} +``` + +This configuration enforces three simultaneous constraints: + +1. **Per-consumer burst**: 10,000 tokens per minute per consumer Review Comment: In the constraints list, “Per-consumer burst” is a bit misleading terminology for a fixed 60s token budget (and could be confused with the `burst` concept in other rate limiting/concurrency plugins). Suggest renaming it to something like “Per-consumer per-minute budget/limit” for accuracy and clarity. ```suggestion 1. 
**Per-consumer per-minute budget**: 10,000 tokens per minute per consumer ``` ########## blog/en/blog/2026/04/14/apisix-3.16-dynamic-rate-limiting.md: ########## @@ -0,0 +1,328 @@
+ +<!--truncate--> + +## What Changed in APISIX 3.16 + +APISIX 3.16 introduces two complementary features across the `limit-count`, `limit-conn`, and `ai-rate-limiting` plugins: + +| Feature | Description | Supported Plugins | +|---------|-------------|-------------------| +| Multiple rules | Define an array of rate limiting rules with independent thresholds and time windows | `limit-count`, `limit-conn`, `ai-rate-limiting` | +| Variable support | Use APISIX variables (`${remote_addr}`, `${http_*}`, `${consumer_name}`, etc.) in `count`, `time_window`, and `key` fields, with optional default values via `${var ?? default}` | `limit-count`, `limit-conn`, `ai-rate-limiting` | + +Both features are fully backward compatible. Existing configurations continue to work without modification. + +## Multiple Rules: Beyond Single-Threshold Rate Limiting + +### The Problem + +Consider a common requirement: limit an API to **10 requests per second** and **500 requests per minute**. Before 3.16, you had to configure two separate plugin instances or chain multiple routes. This was verbose, error-prone, and hard to maintain. + +### The Solution + +The new `rules` array lets you define multiple rate limiting policies in a single plugin configuration. Each rule operates independently with its own counter, time window, and key. + +```json +{ + "uri": "/api/v1/*", + "plugins": { + "limit-count": { + "rules": [ + { + "count": 10, + "time_window": 1, + "key": "${remote_addr}_per_second", + "header_prefix": "per-second" + }, Review Comment: These examples use `${remote_addr}` embedded inside a larger string (e.g., `${remote_addr}_per_second`). Elsewhere in this repo (e.g., the APISIX 3.16.0 release post) variables are shown as `$remote_addr` / `$http_*` without `${}` templating, and it’s unclear whether concatenation inside the `key` field is supported or whether `key` expects a single variable name/value. 
Please double-check the exact syntax APISIX 3.16 supports here and adjust the examples (and/or add a short clarification) so readers can copy/paste working configs.
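As an aid for that double-check, a small script can list every variable reference in the post's JSON examples together with its notation, so each one can be compared against the upstream plugin schema. The following is a minimal Python sketch that assumes only two notations exist (`$name` and `${...}`); the function names `iter_strings` and `find_variables` are hypothetical, and the script only reports what it finds without judging which notation is valid.

```python
import json
import re

# Matches "${...}" (braced) or "$name" (bare). These two notations are
# an assumption taken from the review discussion, not from APISIX docs.
_VAR = re.compile(r"\$\{([^}]*)\}|\$(\w+)")


def iter_strings(node, path=""):
    """Yield (path, value) for every string value in parsed JSON."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from iter_strings(value, f"{path}.{key}" if path else key)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from iter_strings(value, f"{path}[{index}]")
    elif isinstance(node, str):
        yield path, node


def find_variables(config_text):
    """Return (path, notation, reference) for each variable reference."""
    found = []
    for path, text in iter_strings(json.loads(config_text)):
        for match in _VAR.finditer(text):
            if match.group(1) is not None:
                found.append((path, "braced", match.group(1).strip()))
            else:
                found.append((path, "bare", match.group(2)))
    return found
```

Running `find_variables` over each fenced JSON block in the post gives a checklist of fields such as `plugins.limit-count.rules[0].key` and the notation each one uses, which makes mixed or concatenated forms easy to spot before publishing.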
