qianye1001 opened a new issue, #10397:
URL: https://github.com/apache/rocketmq/issues/10397

   ### Before Creating the Bug Report
   
   - [x] I found a bug, not just asking a question, which should be created in 
[GitHub Discussions](https://github.com/apache/rocketmq/discussions).
   
   - [x] I have searched the [GitHub 
Issues](https://github.com/apache/rocketmq/issues) and [GitHub 
Discussions](https://github.com/apache/rocketmq/discussions) of this repository 
and believe that this is not a duplicate.
   
   - [x] I have confirmed that this bug belongs to the current repository, not 
other repositories of RocketMQ.
   
   ### Runtime platform environment
   
   Linux (observed on CentOS 7 / Ubuntu 22.04)
   
   ### RocketMQ version
   
   develop (also affects 5.x releases with TLS hot-reload enabled)
   
   ### JDK Version
   
   JDK 8 / JDK 11, using netty-tcnative (OpenSSL provider)
   
   ### Describe the Bug
   
   When TLS certificates are dynamically reloaded via `TlsCertificateManager` 
(triggered by the file watcher), a new `SslContext` is created but the old one 
is never explicitly released. netty-tcnative's `OpenSslContext` is 
reference-counted and allocates native (off-heap) memory for the certificate 
chain, private key, and SSL session cache, so merely dropping the reference to 
the old context does not free that memory. Reclamation then depends on GC 
finalization, which may never run under low heap pressure.
   
   This causes native memory (RSS) to grow monotonically with each certificate 
rotation cycle. In long-running Proxy/Broker deployments with frequent cert 
rotations (e.g., short-lived certificates rotated every few hours), this 
eventually leads to OOM kills.
   
   ### Steps to Reproduce
   
   1. Enable TLS with OpenSSL provider (`tls.provider=OPENSSL`) on Broker or 
Proxy
   2. Configure certificate hot-reload (`tlsCertWatchIntervalMs`)
   3. Repeatedly replace the certificate files to trigger reload cycles
   4. Monitor native memory (RSS, or `jcmd <pid> VM.native_memory` with Native 
Memory Tracking enabled): it grows on each reload and is never reclaimed
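
   The RSS monitoring in step 4 can be automated with a small sampler. The 
following is a minimal sketch, assuming a Linux host with procfs; `RssProbe` 
and its methods are hypothetical names and not part of RocketMQ.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.OptionalLong;

/** Hypothetical helper (not part of RocketMQ) for sampling a process's RSS on Linux. */
final class RssProbe {

    /** Extracts the VmRSS value in kB from the lines of /proc/<pid>/status. */
    static OptionalLong parseVmRssKb(List<String> statusLines) {
        for (String line : statusLines) {
            if (line.startsWith("VmRSS:")) {
                // Typical format: "VmRSS:\t  123456 kB"
                String[] parts = line.substring("VmRSS:".length()).trim().split("\\s+");
                return OptionalLong.of(Long.parseLong(parts[0]));
            }
        }
        return OptionalLong.empty();
    }

    /** Reads the current RSS of the given process; empty if procfs is unavailable. */
    static OptionalLong readVmRssKb(long pid) {
        Path status = Paths.get("/proc", Long.toString(pid), "status");
        try {
            return parseVmRssKb(Files.readAllLines(status));
        } catch (IOException e) {
            return OptionalLong.empty();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        if (args.length != 1) {
            System.err.println("usage: RssProbe <pid>");
            return;
        }
        long pid = Long.parseLong(args[0]); // PID of the Broker or Proxy JVM
        while (true) {
            // Prints -1 when the value cannot be read (e.g. process gone, no procfs).
            System.out.println(System.currentTimeMillis() + " VmRSS_kB=" + readVmRssKb(pid).orElse(-1));
            Thread.sleep(10_000); // sample every 10 seconds; adjust to the reload cadence
        }
    }
}
```

   Running it alongside the certificate replacement loop and plotting 
`VmRSS_kB` over time should show whether RSS steps up after each reload.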
   
   ### What Did You Expect to See?
   
   Native memory should remain stable after certificate rotation. The old 
`SslContext` should be released promptly when replaced.
   
   ### What Did You See Instead?
   
   Native memory grows by roughly 200 KB to 1 MB per rotation cycle (depending 
on cert chain length and session cache size) and is not reclaimed until the 
process restarts.
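
   To put the per-rotation figure in perspective, here is a back-of-the-envelope 
estimate. It is a sketch, not a measurement: the 2-hour rotation interval, the 
30-day window, and the context counts are assumptions chosen for illustration, 
and `LeakEstimate` is a hypothetical name.

```java
/** Hypothetical back-of-the-envelope estimator; the inputs are assumptions, not measurements. */
final class LeakEstimate {

    /** Bytes leaked after the given number of days if every rotation leaks one context's worth. */
    static long leakedBytes(long bytesPerRotation, double rotationIntervalHours,
                            int contexts, int days) {
        long rotations = (long) Math.floor(days * 24 / rotationIntervalHours);
        return rotations * bytesPerRotation * contexts;
    }

    public static void main(String[] args) {
        // Range reported above: about 200 KB to 1 MB per rotation.
        long low = leakedBytes(200L * 1024, 2.0, 1, 30);
        long high = leakedBytes(1024L * 1024, 2.0, 1, 30);
        System.out.printf("1 context, 30 days: %d MiB to %d MiB%n", low >> 20, high >> 20);

        // Assumed 20 per-domain contexts, each rebuilt on every rotation.
        long highMulti = leakedBytes(1024L * 1024, 2.0, 20, 30);
        System.out.printf("20 contexts, 30 days: up to %d MiB%n", highMulti >> 20);
    }
}
```

   Under these assumptions a single context accumulates tens to hundreds of MiB 
per month, and the total scales linearly with the number of contexts rebuilt 
on each rotation.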
   
   ### Additional Context
   
   The fix should call `ReferenceCountUtil.release(oldSslContext)` once the new 
context is installed. The release must not tear down the old context while 
in-flight channels still use it: either defer it until those channels have 
closed, or use `ReferenceCountUtil.safeRelease()` together with proper draining 
logic.
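
   The swap-then-release idea can be sketched as follows. This is not the 
RocketMQ or Netty implementation: `Releasable` is a stand-in for Netty's 
`ReferenceCounted` so that the example compiles without dependencies, and 
`FakeNativeContext`, `ReloadableContextHolder`, and `SwapDemo` are hypothetical 
names. It assumes each in-flight connection holds its own reference to the 
context it was created from, which should be verified against the Netty 
version in use.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

/** Stand-in for Netty's ReferenceCounted so the sketch compiles without Netty. */
interface Releasable {
    Releasable retain();
    boolean release(); // true when the count reached zero and resources were freed
}

/** Hypothetical fake of a native-backed context, used only to exercise the holder. */
final class FakeNativeContext implements Releasable {
    private final AtomicInteger refCnt = new AtomicInteger(1);
    volatile boolean freed;

    @Override public Releasable retain() { refCnt.incrementAndGet(); return this; }

    @Override public boolean release() {
        if (refCnt.decrementAndGet() == 0) {
            freed = true; // a real context would free its native memory here
            return true;
        }
        return false;
    }
}

/** Hypothetical holder that owns exactly one reference to the current context. */
final class ReloadableContextHolder<T extends Releasable> {
    private final AtomicReference<T> current = new AtomicReference<>();

    T get() { return current.get(); }

    /** Installs the new context first, then drops the holder's reference to the old one. */
    void swap(T fresh) {
        T old = current.getAndSet(fresh);
        if (old != null) {
            old.release(); // with Netty this would be ReferenceCountUtil.release(old)
        }
    }
}

final class SwapDemo {
    public static void main(String[] args) {
        ReloadableContextHolder<FakeNativeContext> holder = new ReloadableContextHolder<>();
        FakeNativeContext oldCtx = new FakeNativeContext();
        holder.swap(oldCtx);
        oldCtx.retain();                      // an in-flight connection takes its own reference
        holder.swap(new FakeNativeContext()); // reload: the holder drops its reference
        System.out.println("freed while in use: " + oldCtx.freed); // false
        oldCtx.release();                     // the connection closes
        System.out.println("freed after close: " + oldCtx.freed);  // true
    }
}
```

   Because the holder releases only its own reference, the old context is freed 
when the last user lets go of it, with no separate draining step. The sketch 
does not handle the race in which a caller obtains the old context from 
`get()` and retains it only after `swap` has already released it.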
   
   Related: #10302 (SNI multi-domain support) introduces more `SslContext` 
instances per domain, making this leak more severe if not addressed.

