qianye1001 opened a new pull request, #10399: URL: https://github.com/apache/rocketmq/pull/10399
## Summary

Fix native memory leak caused by the old `SslContext` not being released during TLS certificate hot-reload when using the OpenSSL (netty-tcnative) provider.

Fixes #10398

## Root Cause

When TLS certificates are dynamically reloaded via `FileWatchService`, the `loadSslContext()` methods in `NettyRemotingServer` and `ProxyAndTlsProtocolNegotiator` directly overwrite the `sslContext` field without releasing the old instance. Since `ReferenceCountedOpenSslContext` allocates native off-heap memory for the certificate chain, private key, and SSL session cache, each reload leaks ~100KB-1MB of native memory.

## Changes

### File 1: `remoting/src/main/java/org/apache/rocketmq/remoting/netty/NettyRemotingServer.java`

- Added `import io.netty.util.ReferenceCountUtil;`
- Modified `loadSslContext()`: build new context into local variable, save old reference, assign new context to field, then release old context via `ReferenceCountUtil.release(oldSslContext)` in try-catch

### File 2: `proxy/src/main/java/org/apache/rocketmq/proxy/grpc/ProxyAndTlsProtocolNegotiator.java`

- Added `import io.grpc.netty.shaded.io.netty.util.ReferenceCountUtil;`
- Changed `sslContext` field to `private static volatile SslContext sslContext;` for thread-safe visibility
- Modified `loadSslContext()`: build new context into local variable, save old reference, assign new context to field, then release old context via `ReferenceCountUtil.release(oldSslContext)` in try-catch

## Fix Strategy

Uses **"build new, then release old"** ordering to ensure `sslContext` is never null or pointing to a released context during the swap:

1. Build new `SslContext` into a local variable
2. Save old `sslContext` reference
3. Assign new context to the field
4. Release old context via `ReferenceCountUtil.release()` (no-op for non-refcounted JDK SslContext)

## Testing

- Existing TLS integration tests cover handshake correctness
- Manual verification: run broker with TLS enabled, trigger cert reload 100+ times, confirm stable RSS/native memory

## Backward Compatibility

- No public API changes
- No configuration changes
- No new dependencies (`ReferenceCountUtil` already in Netty transitive deps)
- Existing connections unaffected (SslHandler holds its own reference)

## Risk Assessment

**LOW**: Minimal, well-isolated lifecycle fix following established Netty ReferenceCounted resource management patterns.
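
To illustrate the "build new, then release old" ordering, here is a minimal, dependency-free sketch. It is not the code from this PR: `FakeNativeContext` and `SslContextSwapSketch` are hypothetical stand-ins for Netty's reference-counted `SslContext` and for the classes that own the `sslContext` field, and `FakeNativeContext.release()` plays the role that `ReferenceCountUtil.release(oldSslContext)` plays in the actual change.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical stand-in for a reference-counted, natively backed SSL context. */
final class FakeNativeContext {
    private final AtomicInteger refCnt = new AtomicInteger(1);
    final String name;

    FakeNativeContext(String name) {
        this.name = name;
    }

    /** Drops one reference; returns true when the simulated native memory is freed. */
    boolean release() {
        int remaining = refCnt.decrementAndGet();
        if (remaining < 0) {
            throw new IllegalStateException("already released: " + name);
        }
        return remaining == 0;
    }

    boolean isReleased() {
        return refCnt.get() <= 0;
    }
}

/** Hypothetical holder mirroring the swap order described in the PR. */
final class SslContextSwapSketch {
    // volatile so that threads reading the field see the newly assigned context
    private static volatile FakeNativeContext sslContext;

    static void loadSslContext(Supplier<FakeNativeContext> builder) {
        // 1. Build the new context first; if this throws, the old one stays in place.
        FakeNativeContext newContext = builder.get();
        // 2. Save the old reference.
        FakeNativeContext oldContext = sslContext;
        // 3. Publish the new context, so the field never holds a released instance.
        sslContext = newContext;
        // 4. Release the old context; a failure here must not undo the swap.
        if (oldContext != null) {
            try {
                oldContext.release();
            } catch (RuntimeException e) {
                System.err.println("Failed to release old context: " + e.getMessage());
            }
        }
    }

    static FakeNativeContext current() {
        return sslContext;
    }
}
```

Because the build step runs before anything else is touched, a reload that fails (for example, on a malformed certificate) leaves the previous context installed and unreleased, which is the property the ordering is meant to guarantee.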
