qianye1001 commented on issue #10302:
URL: https://github.com/apache/rocketmq/issues/10302#issuecomment-4542898215
# Implementation Spec — apache/rocketmq#10302
**Feature:** SNI multi-domain certificate support for Proxy TLS
**Branch:** `develop`
**Date:** 2026-05-26
**Status:** Verified ✅ — ready for implementation
---
## 1. Context & Verified Current State
All claims in the issue have been verified against `apache/rocketmq@develop`:
| Claim | File | Verified |
|---|---|---|
| `ProxyConfig` only has `tlsCertPath` / `tlsKeyPath` | `proxy/src/main/java/org/apache/rocketmq/proxy/config/ProxyConfig.java:82-86` | ✅ |
| `TlsCertificateManager` watches single cert/key pair | `proxy/src/main/java/org/apache/rocketmq/proxy/service/cert/TlsCertificateManager.java:35-49,86-116` | ✅ |
| gRPC negotiator uses single static `SslContext`, no SNI | `proxy/src/main/java/org/apache/rocketmq/proxy/grpc/ProxyAndTlsProtocolNegotiator.java:80,106-139,253-302` | ✅ |
| Remoting TLS helper builds single context (ALPN only) | `proxy/src/main/java/org/apache/rocketmq/proxy/remoting/MultiProtocolTlsHelper.java:53-99` | ✅ |
| `NettyRemotingServer.TlsModeHandler` uses plain `SslHandler` | `remoting/src/main/java/org/apache/rocketmq/remoting/netty/NettyRemotingServer.java:123,180-186,485-536` | ✅ |
| No `SniHandler`, `TlsSniManager`, `TlsDomainConfig`, etc. anywhere | repo-wide grep | ✅ |
> Note: `MultiProtocolTlsHelper` is named for ALPN multiplexing (HTTP/2 vs
remoting), not multi-certificate. It uses the global static
`TlsSystemConfig.tlsServerCertPath`/`tlsServerKeyPath`.
> Issue #10296 is a duplicate and already closed; #10302 is canonical.
---
## 2. Goals & Non-Goals
### Goals
- Serve **multiple top-level domains** with different certificates on the
**same Proxy port** for both gRPC and Remoting protocols.
- Pure-additive configuration: when `tlsDomainConfigs` is empty, runtime
behavior is **identical** to the current single-cert mode.
- Independent hot-reload per cert/key pair (file watcher).
- Wildcard hostname matching, falling back to the default cert when no
domain rule matches.
### Non-Goals
- Multi-cert support in the **broker / NameServer / core remoting** beyond
what Proxy needs (those still use a single global `TlsSystemConfig`). Out of
scope for this issue.
- mTLS / client-cert-based selection.
- ACME / Let's Encrypt automation.
- Per-domain cipher suite or protocol-version overrides.
---
## 3. High-Level Design
```
┌──────────────────────────────────────────┐
│ ProxyConfig                              │
│                                          │
│ tlsCertPath / tlsKeyPath (default)       │
│ tlsDomainConfigs:                        │
│   "*.example.com" -> {cert, key}         │
│   "*.sample.org"  -> {cert, key}         │
└─────────────────┬────────────────────────┘
                  │
                  ▼
┌──────────────────────────────────────────┐
│ TlsSniManager                            │
│ - default SslContext                     │
│ - Map<pattern, SslContext>               │
│ - Mapping<String, SslContext> (Netty)    │
│ - reloadDomain(pattern)                  │
└─────────────────┬────────────────────────┘
                  │ DomainNameMapping / custom Mapping
  ┌───────────────┴────────────────┐
  ▼                                ▼
gRPC pipeline (negotiator)  Remoting pipeline
┌──────────────────────┐  ┌──────────────────────┐
│ HAProxyDecoder       │  │ HAProxyDecoder       │
│ SniHandler ──┐       │  │ TlsModeHandler       │
│              ▼       │  │  └─ SniHandler ─┐    │
│         SslHandler   │  │                 ▼    │
│ HTTP/2 framer        │  │            SslHandler│
└──────────────────────┘  └──────────────────────┘
                        ▲
                        │ file change events
            ┌───────────┴────────────┐
            │ TlsCertificateManager  │ (one FileWatchService
            │ - per-domain watchers  │  per cert+key pair)
            └────────────────────────┘
```
---
## 4. File-Level Changes
### 4.1 New files (4)
| Path | Purpose |
|---|---|
| `proxy/src/main/java/org/apache/rocketmq/proxy/config/TlsDomainConfig.java` | POJO: `certPath`, `keyPath`, optional `keyPassword`. Jackson-friendly (no-arg ctor + getters/setters). |
| `proxy/src/main/java/org/apache/rocketmq/proxy/service/cert/TlsSniManager.java` | Holds default `SslContext` + `Map<String pattern, SslContext>`. Exposes a Netty `Mapping<String, SslContext>` for `SniHandler`. Provides `reload(String pattern)` and `reloadDefault()`. Thread-safe via `volatile` references and `ConcurrentHashMap`. |
| `proxy/src/main/java/org/apache/rocketmq/proxy/service/cert/SniHostnameMatcher.java` | Pure-function matcher implementing the wildcard rules (§5). Unit-testable in isolation. |
| `proxy/src/main/java/org/apache/rocketmq/proxy/service/cert/TlsContextProvider.java` | Indirection used by remoting `TlsModeHandler` to obtain either a `Mapping<String, SslContext>` (SNI mode) or a single `SslContext` (legacy mode). Lets remoting and gRPC share the same SNI manager. |
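
As an illustration of the "Jackson-friendly" requirement on `TlsDomainConfig`, a minimal sketch could look like the following. The class name `TlsDomainConfigSketch` and the redacting `toString()` are assumptions made for this example; only the three field names come from the table above.

```java
/**
 * Hypothetical shape of the per-domain TLS config POJO.
 * Field names follow the spec; everything else is illustrative.
 */
class TlsDomainConfigSketch {
    private String certPath;
    private String keyPath;
    private String keyPassword; // optional, may stay null

    // A no-arg constructor plus getters/setters is what bean-style
    // deserializers such as Jackson rely on by default.
    TlsDomainConfigSketch() {
    }

    public String getCertPath() { return certPath; }
    public void setCertPath(String certPath) { this.certPath = certPath; }

    public String getKeyPath() { return keyPath; }
    public void setKeyPath(String keyPath) { this.keyPath = keyPath; }

    public String getKeyPassword() { return keyPassword; }
    public void setKeyPassword(String keyPassword) { this.keyPassword = keyPassword; }

    @Override
    public String toString() {
        // Assumption: the password should never reach log output.
        return "TlsDomainConfig{certPath=" + certPath + ", keyPath=" + keyPath + "}";
    }
}
```

Keeping the password out of `toString()` is a design choice of this sketch, not something the spec mandates.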
### 4.2 Modified files (8)
| Path | Change |
|---|---|
| `proxy/src/main/java/org/apache/rocketmq/proxy/config/ProxyConfig.java` | Add `private Map<String, TlsDomainConfig> tlsDomainConfigs = new HashMap<>();` + getter/setter. Keep existing `tlsCertPath`/`tlsKeyPath` as the default fallback. |
| `proxy/src/main/java/org/apache/rocketmq/proxy/service/cert/TlsCertificateManager.java` | Refactor to support N watched (cert, key) pairs. Internally keep a list of per-pair `FileWatchService` + listener. Each listener fires `onReload(pattern)` to `TlsSniManager` (or `onDefaultReload()`). Preserve existing single-pair behavior when `tlsDomainConfigs` is empty. |
| `proxy/src/main/java/org/apache/rocketmq/proxy/grpc/ProxyAndTlsProtocolNegotiator.java` | Replace static single `SslContext` with a `TlsSniManager`. In `TlsModeHandler` pipeline, when bytes indicate TLS, insert `new SniHandler(tlsSniManager.asMapping())` instead of pre-baking an `InternalProtocolNegotiators.serverTls(ctx).newHandler(...)`. Use `SniHandler#newSslHandler(SslContext, ByteBufAllocator)` override hook to wrap the chosen context with gRPC's protocol negotiator (preserves ALPN/HTTP-2 behavior). When `tlsDomainConfigs` is empty, fall back to legacy code path verbatim (no SNI handler). |
| `proxy/src/main/java/org/apache/rocketmq/proxy/remoting/MultiProtocolTlsHelper.java` | Add overload `buildSniContextProvider(ProxyConfig)` returning a `TlsContextProvider`. Existing `buildSslContext()` retained for back-compat. |
| `proxy/src/main/java/org/apache/rocketmq/proxy/remoting/MultiProtocolRemotingServer.java` | Wire `TlsContextProvider` into the server. When SNI mode active, install `SniHandler` in pipeline before the protocol decoder; otherwise keep current `SslHandler`. |
| `remoting/src/main/java/org/apache/rocketmq/remoting/netty/NettyRemotingServer.java` | Extend `TlsModeHandler` constructor to accept an optional `TlsContextProvider`. If present, add `new SniHandler(provider.mapping())` to pipeline instead of `sslContext.newHandler(...)`. **No behavior change** when provider is null; this preserves broker/nameserver behavior. |
| `proxy/src/main/java/org/apache/rocketmq/proxy/ProxyStartup.java` | Build `TlsSniManager` from `ProxyConfig` once; pass it into both gRPC negotiator construction and remoting helper. |
| `proxy/src/main/java/org/apache/rocketmq/proxy/grpc/GrpcServer.java` (or equivalent builder) | Pass `TlsSniManager` reference into `ProxyAndTlsProtocolNegotiator` constructor. |
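
To make the "N watched (cert, key) pairs, each firing `onReload(pattern)`" idea concrete, here is a rough, JDK-only sketch. It does not use RocketMQ's `FileWatchService`; the class `CertPairChangeDetectorSketch`, its `pollOnce()` method, and the choice of a SHA-256 content digest are all assumptions for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

/** Hypothetical stand-in for per-pair watching; not the real FileWatchService API. */
final class CertPairChangeDetectorSketch {
    private static final class Pair {
        final Path cert;
        final Path key;
        volatile byte[] lastDigest;

        Pair(Path cert, Path key) {
            this.cert = cert;
            this.key = key;
        }
    }

    private final Map<String, Pair> pairs = new ConcurrentHashMap<>();
    private final Consumer<String> onReload; // receives the domain pattern

    CertPairChangeDetectorSketch(Consumer<String> onReload) {
        this.onReload = onReload;
    }

    void register(String pattern, Path cert, Path key) throws IOException {
        Pair p = new Pair(cert, key);
        p.lastDigest = digest(p); // baseline, so the first poll reports no change
        pairs.put(pattern, p);
    }

    /** Runs one polling pass and returns how many pairs changed. */
    int pollOnce() {
        int changed = 0;
        for (Map.Entry<String, Pair> e : pairs.entrySet()) {
            Pair p = e.getValue();
            try {
                byte[] now = digest(p);
                if (!Arrays.equals(now, p.lastDigest)) {
                    p.lastDigest = now;
                    onReload.accept(e.getKey()); // only this pattern is reloaded
                    changed++;
                }
            } catch (IOException ex) {
                // File unreadable (for example mid-rotation): retry on the next pass.
            }
        }
        return changed;
    }

    private static byte[] digest(Pair p) throws IOException {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            md.update(Files.readAllBytes(p.cert));
            md.update(Files.readAllBytes(p.key));
            return md.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Hashing cert and key together means a change to either file triggers exactly one callback for that pattern, and other domains are left untouched.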
### 4.3 New test files (4)
| Path | Coverage |
|---|---|
| `proxy/src/test/java/org/apache/rocketmq/proxy/service/cert/SniHostnameMatcherTest.java` | Wildcard semantics matrix (§5). |
| `proxy/src/test/java/org/apache/rocketmq/proxy/service/cert/TlsSniManagerTest.java` | Mapping lookup, fallback, concurrent reload, missing-file behavior. |
| `proxy/src/test/java/org/apache/rocketmq/proxy/grpc/ProxyAndTlsProtocolNegotiatorSniTest.java` | `EmbeddedChannel` ClientHello with SNI extension → expected `SslContext` chosen. Legacy mode (no `tlsDomainConfigs`) still works. |
| `proxy/src/test/java/org/apache/rocketmq/proxy/remoting/MultiProtocolTlsHelperSniTest.java` | Remoting path: SNI handler installed, ALPN unchanged. |
Existing `ProxyAndTlsProtocolNegotiatorTest` and `TlsCertificateManagerTest`
must remain green (no regressions in single-cert mode).
---
## 5. Wildcard Matching Algorithm (`SniHostnameMatcher`)
Input: hostname `h` from ClientHello SNI (already lowercased by Netty).
Configured patterns: set `P` (lowercased at load).
```
1. If h is null or empty → return defaultContext
2. If h ∈ P → return P[h]                    // exact match (O(1))
3. Split h into labels [l0, l1, …, lN]
4. If N ≥ 1:                                 // single-label wildcard
     candidate = "*." + join(l1 … lN, ".")   // replace only the leftmost label
     if candidate ∈ P → return P[candidate]
   // Do not try shorter suffixes: a single "*" matches exactly one label,
   // so a.b.example.com must not match *.example.com.
5. Bare-domain fallback: candidate = "*." + h
     if candidate ∈ P → return P[candidate]  // example.com matches *.example.com
   (treat bare apex as if it had an empty leading label that matches "*")
6. Otherwise → return defaultContext
```
Matrix (must be covered by `SniHostnameMatcherTest`):
| Hostname | Pattern | Result |
|---|---|---|
| `foo.example.com` | `*.example.com` | match |
| `example.com` | `*.example.com` | match (bare-domain rule) |
| `a.b.example.com` | `*.example.com` | **no** match (multi-level) |
| `foo.example.com` | `foo.example.com` | exact match (priority over wildcard) |
| `bar.sample.org` | `*.example.com` | no match → default |
| `EXAMPLE.com` (uppercase) | `*.example.com` | match (case-insensitive) |
| `null` / empty SNI | any | default |
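
The matrix above can be sketched in plain Java as follows. This is a minimal illustration rather than the planned `SniHostnameMatcher`: the class name `SniHostnameMatcherSketch` is hypothetical, and the type parameter `V` stands in for Netty's `SslContext` so the example runs without Netty on the classpath.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Sketch of the lookup order from the matrix; V stands in for SslContext. */
final class SniHostnameMatcherSketch<V> {
    private final Map<String, V> patterns = new HashMap<>();
    private final V defaultValue;

    SniHostnameMatcherSketch(Map<String, V> configured, V defaultValue) {
        // Normalize patterns once at load time so lookups are plain map hits.
        configured.forEach((p, v) -> patterns.put(p.toLowerCase(Locale.ROOT), v));
        this.defaultValue = defaultValue;
    }

    V match(String hostname) {
        if (hostname == null || hostname.isEmpty()) {
            return defaultValue; // client sent no SNI extension
        }
        String h = hostname.toLowerCase(Locale.ROOT);
        V exact = patterns.get(h);
        if (exact != null) {
            return exact; // exact match takes priority over any wildcard
        }
        int dot = h.indexOf('.');
        if (dot > 0) {
            // Replace only the leftmost label, so "*" covers exactly one label.
            V wildcard = patterns.get("*" + h.substring(dot));
            if (wildcard != null) {
                return wildcard;
            }
        }
        // Bare-domain rule: "example.com" is served by "*.example.com".
        V apex = patterns.get("*." + h);
        return apex != null ? apex : defaultValue;
    }
}
```

Because only the leftmost label is ever replaced, `a.b.example.com` is looked up as `*.b.example.com` and never as `*.example.com`, which gives the "no multi-level match" row without a separate label-count check.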
---
## 6. Configuration
### YAML example
```yaml
tlsTestModeEnable: false
tlsCertPath: /etc/rocketmq/tls/default.crt
tlsKeyPath: /etc/rocketmq/tls/default.key
tlsCertWatchIntervalMs: 3600000
tlsDomainConfigs:
"*.example.com":
certPath: /etc/rocketmq/tls/example.crt
keyPath: /etc/rocketmq/tls/example.key
"*.sample.org":
certPath: /etc/rocketmq/tls/sample.crt
keyPath: /etc/rocketmq/tls/sample.key
```
### JSON (Jackson-deserializable)
```json
{
"tlsCertPath": "/etc/rocketmq/tls/default.crt",
"tlsKeyPath": "/etc/rocketmq/tls/default.key",
"tlsDomainConfigs": {
"*.example.com": { "certPath": "/etc/rocketmq/tls/example.crt",
"keyPath": "/etc/rocketmq/tls/example.key" }
}
}
```
Validation at startup (in `ProxyStartup`):
- Each domain config must have non-blank `certPath` and `keyPath`.
- Files must exist and be readable.
- Pattern must match regex `^(\*\.)?([a-z0-9-]+\.)+[a-z]{2,}$`
(case-insensitive, normalized to lower-case).
- Patterns starting with `*.` may have only one wildcard at the leading
position.
- Duplicate patterns → fail-fast.
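
A possible shape for the pattern checks in this list, as a sketch only: the class `TlsDomainPatternValidatorSketch` and its `validate` method are hypothetical names, and the file existence/readability checks are left out to keep the example short. The regex is the one quoted above.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Pattern;

/** Hypothetical startup validation for the keys of tlsDomainConfigs. */
final class TlsDomainPatternValidatorSketch {
    // Regex taken from the spec; input is lower-cased before matching.
    private static final Pattern DOMAIN =
        Pattern.compile("^(\\*\\.)?([a-z0-9-]+\\.)+[a-z]{2,}$");

    /** Returns the normalized patterns, or throws on the first invalid or duplicate one. */
    static Set<String> validate(List<String> rawPatterns) {
        Set<String> seen = new LinkedHashSet<>();
        for (String raw : rawPatterns) {
            String p = raw == null ? "" : raw.trim().toLowerCase(Locale.ROOT);
            if (!DOMAIN.matcher(p).matches()) {
                // Also rejects a wildcard anywhere except the leading position.
                throw new IllegalArgumentException("Invalid TLS domain pattern: " + raw);
            }
            if (!seen.add(p)) {
                // Case variants such as "*.Example.com" and "*.example.com" collide here.
                throw new IllegalArgumentException("Duplicate TLS domain pattern: " + raw);
            }
        }
        return seen;
    }
}
```

Since the config is a map, literally identical keys cannot both survive parsing; the duplicate check in this sketch therefore mainly catches keys that differ only in case or surrounding whitespace.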
---
## 7. Backward Compatibility
- When `tlsDomainConfigs` is empty (or absent), `TlsSniManager` is **not
constructed**; the gRPC negotiator and the remoting server take the existing
legacy code paths unchanged.
- `TlsCertificateManager`'s public surface preserved; new multi-watcher
logic only activates when more than the default pair is registered.
- No changes to `TlsSystemConfig` (global statics) — broker/nameserver
unaffected.
- No new mandatory CLI flags; no protocol-level changes; rolling upgrade
compatible.
---
## 8. Risks & Mitigations
| Risk | Mitigation |
|---|---|
| gRPC's `InternalProtocolNegotiators.serverTls(...)` expects the `SslContext` at handler-construction time; using `SniHandler` defers context selection. | Subclass `SniHandler` and override `newSslHandler(SslContext, ByteBufAllocator)` to invoke gRPC's negotiator with the resolved context (the same wiring it does internally). Cover with `EmbeddedChannel` test sending a real ClientHello bytestream. |
| Pipeline ordering with HAProxy protocol decoder. | Install order remains: `HAProxyMessageDecoder` → `TlsModeHandler` → (`SniHandler` → `SslHandler`) → app handlers. SNI handler must come **after** any proxy-protocol stripping so the first bytes it inspects are the real ClientHello. |
| Hot-reload race: a connection mid-handshake while context swaps. | New connections only pick up the new context. In-flight handshake uses the snapshot it captured at `SniHandler.newSslHandler` time. Use `volatile` reference swap inside `TlsSniManager`, with no locking on the hot path. |
| Misconfigured pattern silently falls through to default cert (the client then fails hostname verification against the wrong certificate). | Log a `WARN` once per unique unmatched SNI hostname (rate-limited). |
| Missing/unreadable cert at runtime reload. | Keep the previous good context; log `ERROR`; expose a metric `proxy_tls_cert_reload_failures_total{pattern=...}`. |
| Memory cost of N `SslContext` instances. | Negligible (KBs each); document recommended cap ~50 domains. |
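
The "swap without locking, keep the previous good context on failure" mitigations can be sketched as below. This is an assumption-laden illustration: `ReloadableContextsSketch` is a hypothetical name, the type parameter `C` stands in for `io.netty.handler.ssl.SslContext`, and the failure counter is a plain `AtomicLong` rather than a real metric.

```java
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Sketch of lock-free context publication; C stands in for SslContext. */
final class ReloadableContextsSketch<C> {
    private final Map<String, C> byPattern = new ConcurrentHashMap<>();
    private final AtomicLong reloadFailures = new AtomicLong();
    private volatile C defaultContext;

    ReloadableContextsSketch(C defaultContext) {
        this.defaultContext = defaultContext;
    }

    /** Hot path: a single map read plus a volatile read, no locks. */
    C lookup(String pattern) {
        C c = byPattern.get(pattern);
        return c != null ? c : defaultContext;
    }

    /** Builds the new context first and publishes it only if loading succeeded. */
    boolean reload(String pattern, Callable<C> loader) {
        try {
            C fresh = loader.call();
            byPattern.put(pattern, fresh); // atomic swap for new connections
            return true;
        } catch (Exception e) {
            // Previous good context stays in place; a real implementation
            // would also log at ERROR level here.
            reloadFailures.incrementAndGet();
            return false;
        }
    }

    boolean reloadDefault(Callable<C> loader) {
        try {
            defaultContext = loader.call();
            return true;
        } catch (Exception e) {
            reloadFailures.incrementAndGet();
            return false;
        }
    }

    long reloadFailureCount() {
        return reloadFailures.get();
    }
}
```

A handshake that already obtained a context through `lookup` keeps using that object, so a concurrent `reload` affects only connections that resolve their context afterwards.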
---
## 9. Test Plan
### Unit
- `SniHostnameMatcherTest` — full matrix from §5.
- `TlsSniManagerTest` — context registration, lookup, reload, concurrent
reload while resolving (use `CountDownLatch`).
- `TlsCertificateManagerTest` — extended for multi-pair watchers; ensure
single-pair legacy path unchanged.
### Integration (Netty `EmbeddedChannel`)
- Feed a synthetic ClientHello with SNI = `foo.example.com`; assert the
resolved `SslContext` corresponds to `*.example.com`.
- Feed ClientHello with no SNI; assert default context.
- Feed ClientHello with SNI = `unknown.test`; assert default context + WARN
log.
- Verify ALPN still negotiates `h2` on gRPC pipeline after SNI resolution.
### Manual / E2E
- `openssl s_client -connect proxy:8443 -servername foo.example.com
-showcerts` → returns `example.crt`.
- `openssl s_client -connect proxy:8443 -servername foo.sample.org
-showcerts` → returns `sample.crt`.
- `openssl s_client -connect proxy:8443 -servername other.test -showcerts` →
returns `default.crt`.
- Touch a domain cert on disk; within `tlsCertWatchIntervalMs`, new
connections present the updated cert (existing connections unaffected).
- Restart Proxy with `tlsDomainConfigs` removed → behaves identically to
current release.
### Compatibility
- Run full existing TLS test suite
(`proxy/src/test/java/.../ProxyAndTlsProtocolNegotiatorTest`,
`TlsCertificateManagerTest`, `remoting/.../TlsTest`) — must pass without
modification.
---
## 10. Rollout
1. PR #1: introduce `TlsDomainConfig`, `SniHostnameMatcher`, `TlsSniManager`
(with tests) — no wiring yet. Safe to merge.
2. PR #2: extend `TlsCertificateManager` for multi-pair watching (with
tests).
3. PR #3: wire gRPC negotiator + remoting helper + `ProxyStartup`; gated on
non-empty `tlsDomainConfigs`.
4. Docs: update `docs/cn/Configuration_TLS.md` and
`docs/en/Configuration_TLS.md` with SNI section + YAML example.
---
## 11. Open Questions
1. Should the default cert be optional when `tlsDomainConfigs` is set (i.e.
require SNI from clients)? Current design keeps the default mandatory — simpler
and avoids handshake failures for legacy clients. Recommend: keep default
mandatory.
2. Should `TlsSniManager` also be wired into `NettyRemotingServer` outside
of proxy (broker/namesrv)? Out of scope for this issue; track as a follow-up.
3. Metric/log namespace conventions — align with
`org.apache.rocketmq.proxy.metrics.*`.
---
## 12. Acceptance Criteria
- [ ] `tlsDomainConfigs` accepted in `proxy.json` / yaml and parsed into
`Map<String, TlsDomainConfig>`.
- [ ] gRPC and Remoting both serve correct cert per SNI hostname on the same
port.
- [ ] Wildcard matching matrix (§5) fully covered by unit tests.
- [ ] Hot-reload of any single cert/key pair does not interrupt other
domains' traffic.
- [ ] Empty/absent `tlsDomainConfigs` → behavior identical to the current
release (verified by the existing test suite).
- [ ] Documentation updated in both `docs/cn` and `docs/en`.