iakuf opened a new pull request, #13049:
URL: https://github.com/apache/apisix/pull/13049
# PR: feat(ai-rate-limiting): add `standard_headers` option for OpenAI/OpenRouter-compatible rate-limit headers
## Summary
Add a `standard_headers` boolean option to the `ai-rate-limiting` plugin.
When enabled, the plugin emits rate-limit response headers that follow the
OpenAI / OpenRouter convention, allowing IDE extensions (Cursor, Continue, etc.)
to detect quota exhaustion and apply automatic back-off without any custom
client-side configuration.
---
## Issue / Motivation
The current `ai-rate-limiting` plugin outputs headers in the format:
```
X-AI-RateLimit-Limit-{instance_name}
X-AI-RateLimit-Remaining-{instance_name}
X-AI-RateLimit-Reset-{instance_name}
```
This format is APISIX-specific and is not recognized by popular AI IDE
extensions such as **Cursor** and **Continue**. These tools look for the
OpenAI/OpenRouter standard headers:
```
X-RateLimit-Limit-Tokens
X-RateLimit-Remaining-Tokens
X-RateLimit-Reset-Tokens
```
Without these headers, IDE extensions cannot detect that they are being
rate-limited and will keep retrying immediately, causing a poor developer
experience and wasting quota.
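To make the client-side behaviour concrete, here is a minimal sketch of how a client might turn these headers into a back-off delay. It is not taken from Cursor or Continue; `backoff_seconds` is a hypothetical helper, and it assumes the reset header carries a plain number of seconds (some providers send duration strings such as `6m0s`, which this sketch treats as unparseable).

```python
from typing import Mapping, Optional


def backoff_seconds(status: int, headers: Mapping[str, str]) -> Optional[float]:
    """Return how long to wait before retrying, or None if no wait is indicated."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    remaining = lowered.get("x-ratelimit-remaining-tokens")
    reset = lowered.get("x-ratelimit-reset-tokens")

    if remaining is None or reset is None:
        # Standard headers absent (for example, only X-AI-RateLimit-* was sent):
        # the client has nothing to base a back-off on.
        return None

    try:
        exhausted = status == 429 or float(remaining) <= 0
        wait = float(reset)  # ASSUMPTION: reset is a number of seconds
    except ValueError:
        return None

    return max(wait, 0.0) if exhausted else None
```

With only the legacy `X-AI-RateLimit-*` headers present, the function returns `None`, which corresponds to the immediate-retry behaviour described above.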
---
## Changes
### `apisix/plugins/ai-rate-limiting.lua`
- Added `standard_headers` field to the JSON Schema (`boolean`, default
`false`).
- In `transform_limit_conf()`: when `standard_headers` is `true`, the
`limit_header`, `remaining_header`, and `reset_header` fields passed to
`limit-count` are set to the standard names with a suffix derived from
`limit_strategy`:
| `limit_strategy` | Suffix |
|---------------------|---------------------|
| `total_tokens` | `Tokens` |
| `prompt_tokens` | `PromptTokens` |
| `completion_tokens` | `CompletionTokens` |
- When `standard_headers` is `false` (default), the original
`X-AI-RateLimit-*-{instance_name}` headers are used — **fully backward
compatible**.
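The naming rule above can be sketched as follows. This is a Python illustration of the mapping only, not the plugin's Lua code; `header_names` and `SUFFIX_BY_STRATEGY` are hypothetical names, while the keys of the returned dict follow the `limit-count` field names mentioned above.

```python
from typing import Dict

# Suffix table from this PR: limit_strategy -> header suffix.
SUFFIX_BY_STRATEGY = {
    "total_tokens": "Tokens",
    "prompt_tokens": "PromptTokens",
    "completion_tokens": "CompletionTokens",
}


def header_names(limit_strategy: str, instance_name: str,
                 standard_headers: bool = False) -> Dict[str, str]:
    """Derive the three header names handed to limit-count."""
    if standard_headers:
        prefix = "X-RateLimit"
        suffix = SUFFIX_BY_STRATEGY[limit_strategy]
    else:
        # Legacy behaviour: APISIX-specific prefix, suffixed by instance name.
        prefix = "X-AI-RateLimit"
        suffix = instance_name
    return {
        "limit_header": f"{prefix}-Limit-{suffix}",
        "remaining_header": f"{prefix}-Remaining-{suffix}",
        "reset_header": f"{prefix}-Reset-{suffix}",
    }
```

For example, `header_names("prompt_tokens", "my-instance", standard_headers=True)` yields `X-RateLimit-Limit-PromptTokens` and its two siblings, while the default call keeps the `X-AI-RateLimit-*-my-instance` names.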
### New / updated files
| File | Description |
|------|-------------|
| `apisix/plugins/ai-rate-limiting.lua` | Core change |
| `t/plugin/ai-rate-limiting-standard-headers.t` | Test::Nginx test suite |
| `docs/en/latest/plugins/ai-rate-limiting.md` | Documentation update (see patch file) |
---
## Test Cases
The new test file `t/plugin/ai-rate-limiting-standard-headers.t` covers:
1. **Schema check**: `standard_headers: true` is accepted by `check_schema`.
2. **Schema default**: `standard_headers` defaults to `false`.
3. **Standard headers present**: a normal request with `standard_headers: true`
returns all three `X-RateLimit-*-Tokens` headers with numeric values.
4. **429 Remaining = 0**: when the quota is exhausted, the 429 response carries
`X-RateLimit-Remaining-Tokens: 0`.
5. **`prompt_tokens` suffix**: `limit_strategy: prompt_tokens` produces
`X-RateLimit-*-PromptTokens` headers.
6. **`completion_tokens` suffix**: `limit_strategy: completion_tokens` produces
`X-RateLimit-*-CompletionTokens` headers.
7. **Backward compatibility**: `standard_headers: false` still produces the
legacy `X-AI-RateLimit-*-{instance_name}` headers.
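For anyone checking a running gateway by hand, the assertions in cases 3 to 6 can be approximated with a small script. The following is a sketch only and is not part of the Test::Nginx suite; `check_standard_headers` is a hypothetical helper, and it assumes all three header values are plain non-negative numbers.

```python
import re
from typing import List, Mapping

_NUMERIC = re.compile(r"\d+(\.\d+)?")


def check_standard_headers(status: int, headers: Mapping[str, str],
                           suffix: str = "Tokens") -> List[str]:
    """Return a list of problems; an empty list means the response looks right."""
    # Header names are case-insensitive; strip stray whitespace from values.
    lowered = {name.lower(): value.strip() for name, value in headers.items()}
    problems = []
    for kind in ("Limit", "Remaining", "Reset"):
        name = f"X-RateLimit-{kind}-{suffix}"
        value = lowered.get(name.lower())
        if value is None:
            problems.append(f"missing {name}")
        elif not _NUMERIC.fullmatch(value):
            problems.append(f"non-numeric {name}: {value!r}")
        elif kind == "Remaining" and status == 429 and float(value) != 0:
            # Case 4: an exhausted quota should report zero remaining.
            problems.append(f"429 response but {name} is {value}")
    return problems
```

Passing `suffix="PromptTokens"` or `suffix="CompletionTokens"` covers the other two strategies.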
### Running the tests locally
```bash
# Copy sources to a Linux filesystem (required for Unix socket support)
rm -rf /tmp/apisix-test
cp -r /path/to/apisix /tmp/apisix-test
# Run the new test file
docker run --rm --user root \
  -v /tmp/apisix-test:/usr/local/apisix/apisix-src \
  apache/apisix:3.15.0-debian bash -c '
    apt-get update -qq &&
    apt-get install -y --no-install-recommends cpanminus git make libwww-perl &&
    cpanm --notest Test::Nginx &&
    git clone --depth=1 https://github.com/openresty/test-nginx.git /test-nginx &&
    ln -sf /usr/local/apisix/deps /usr/local/apisix/apisix-src/deps &&
    cd /usr/local/apisix/apisix-src &&
    APISIX_HOME=/usr/local/apisix/apisix-src \
    TEST_NGINX_BINARY=/usr/bin/openresty \
    prove -I/test-nginx/lib -I./ t/plugin/ai-rate-limiting-standard-headers.t
  '
```
---
## Documentation
See `docs/en/latest/plugins/ai-rate-limiting-standard-headers-patch.md` for the
parameter reference table, configuration example, and sample response headers.
---
## Checklist
- [x] New feature is backward compatible (`standard_headers` defaults to `false`)
- [x] JSON Schema updated with new field
- [x] Test::Nginx tests added (7 test cases)
- [x] Documentation written
- [ ] `CHANGELOG` entry (to be added before merge)
- [ ] CI passes
---
## Related
- OpenRouter rate-limit header spec: https://openrouter.ai/docs/api-reference/limits
- OpenAI rate-limit headers: https://platform.openai.com/docs/guides/rate-limits