qianye1001 opened a new issue, #10398: URL: https://github.com/apache/rocketmq/issues/10398
### Before Creating the Bug Report

- [x] I found a bug, not just asking a question, which should be created in [GitHub Discussions](https://github.com/apache/rocketmq/discussions).
- [x] I have searched the [GitHub Issues](https://github.com/apache/rocketmq/issues) and [GitHub Discussions](https://github.com/apache/rocketmq/discussions) of this repository and believe that this is not a duplicate.
- [x] I have confirmed that this bug belongs to the current repository, not other repositories of RocketMQ.

### Runtime platform environment

Linux (observed on CentOS 7 / Ubuntu 22.04)

### RocketMQ version

develop (also affects 5.x releases with TLS hot-reload enabled)

### JDK Version

JDK 8 / JDK 11, using netty-tcnative (OpenSSL provider)

### Describe the Bug

When TLS certificates are dynamically reloaded via `TlsCertificateManager` (triggered by the file watcher), a new `SslContext` is created but the old one is never explicitly released. netty-tcnative's `OpenSslContext` is reference-counted and allocates native (off-heap) memory for the certificate chain, private key, and SSL session cache, so simply dropping the reference to the old context does not free that memory. Reclamation then relies on GC finalization, which may never run under low heap pressure.

As a result, native memory (RSS) grows monotonically with each certificate rotation cycle. In long-running Proxy/Broker deployments with frequent cert rotations (e.g., short-lived certificates rotated every few hours), this eventually leads to OOM kills.

### Steps to Reproduce

1. Enable TLS with the OpenSSL provider (`tls.provider=OPENSSL`) on Broker or Proxy.
2. Configure certificate hot-reload (`tlsCertWatchIntervalMs`).
3. Repeatedly replace the certificate files to trigger reload cycles.
4. Monitor native memory (RSS or `jcmd <pid> VM.native_memory`); it grows on each reload and is never reclaimed.

### What Did You Expect to See?

Native memory should remain stable after certificate rotation. The old `SslContext` should be released promptly when it is replaced.
### What Did You See Instead?

Native memory grows by roughly 200 KB to 1 MB per rotation cycle (depending on cert chain length and session cache size) and is never reclaimed until the process is restarted.

### Additional Context

The fix should call `ReferenceCountUtil.release(oldSslContext)` after the new context is installed. Care is needed to defer the release until in-flight channels using the old context have closed, or to use `ReferenceCountUtil.safeRelease()` together with proper draining logic.

Related: #10302 (SNI multi-domain support) introduces more `SslContext` instances per domain, which makes this leak more severe if it is not addressed.
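
The "install the new context, then release the old one once its users have drained" idea can be sketched in plain Java. This is only an illustration under stated assumptions, not the RocketMQ or Netty implementation: `CountedContext` and `ReloadableContextHolder` are hypothetical names, `CountedContext` merely stands in for a reference-counted `SslContext`, and the `freeNative` callback models freeing the off-heap memory.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical stand-in for a reference-counted, natively backed SslContext. */
final class CountedContext {
    // Starts at 1: the reference owned by whoever installed the context.
    private final AtomicInteger refCnt = new AtomicInteger(1);
    private final Runnable freeNative; // models freeing the off-heap memory

    CountedContext(Runnable freeNative) {
        this.freeNative = freeNative;
    }

    /** Takes one more reference; fails if the context was already freed. */
    boolean tryRetain() {
        for (;;) {
            int cnt = refCnt.get();
            if (cnt <= 0) {
                return false;
            }
            if (refCnt.compareAndSet(cnt, cnt + 1)) {
                return true;
            }
        }
    }

    /** Drops one reference; the last release frees the native memory. */
    void release() {
        int remaining = refCnt.decrementAndGet();
        if (remaining == 0) {
            freeNative.run();
        } else if (remaining < 0) {
            throw new IllegalStateException("released more often than retained");
        }
    }
}

/** Hypothetical holder that swaps contexts on certificate reload. */
final class ReloadableContextHolder {
    private final AtomicReference<CountedContext> current;

    ReloadableContextHolder(CountedContext initial) {
        this.current = new AtomicReference<>(initial);
    }

    /** Called per new channel; the caller must release() when the channel closes. */
    CountedContext acquire() {
        for (;;) {
            CountedContext ctx = current.get();
            if (ctx.tryRetain()) {
                return ctx;
            }
            // The context was swapped out and freed between get() and tryRetain(); retry.
        }
    }

    /** Installs the new context first, then drops the holder's reference to the old one. */
    void reload(CountedContext next) {
        CountedContext old = current.getAndSet(next);
        old.release();
    }
}
```

In this sketch a reload never frees a context that a live channel still holds: the old context is freed either inside `reload` (when no channel has retained it) or by the last channel's `release()` call. A real fix would have to map this onto Netty's own reference counting rather than a hand-rolled counter.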
