QiuYucheng2003 opened a new issue, #4714: URL: https://github.com/apache/bookkeeper/issues/4714
**BUG REPORT** ***Describe the bug*** A critical "Large Object Retention" (Type I) memory leak exists in `LegacyHierarchicalLedgerManager.java`. The class uses a static `ThreadLocal<StringBuilder>` named `threadLocalNodeBuilder` to build Zookeeper paths to avoid object allocation. Inside `getLedgerRangeByLevel()`, the code clears the builder using `nodeBuilder.setLength(0)` before reusing it. However, `setLength(0)` only resets the character counter; it **does not** shrink or release the underlying `char[]` backing array. If a worker thread processes an abnormally long path or experiences a burst of data, the internal array dynamically expands to a "high-water mark". Because `ThreadLocal.remove()` is never invoked, these bloated `char[]` arrays are permanently anchored to long-lived worker threads. Over time, they promote to the Old Generation, causing severe heap fragmentation and unbounded memory expansion. ***To Reproduce*** Steps to reproduce the behavior: 1. Run Apache BookKeeper under high metadata load involving exceptionally long Znode paths. 2. Trigger operations that continuously invoke `getLedgerRangeByLevel(level1, level2)` across the thread pool. 3. The `threadLocalNodeBuilder` expands its internal array to accommodate the largest processed path. 4. Take a JVM Heap Dump and observe large, mostly empty `char[]` instances lingering in the Old Generation, strongly referenced by the `ThreadLocalMap` of the I/O worker threads. ***Expected behavior*** Memory should not be permanently retained by thread pools. To fix this, the system should either: 1. Allocate a local `StringBuilder` inside the method. Modern JVM Escape Analysis handles short-lived object allocation extremely efficiently. 2. If caching is strictly necessary, enforce a maximum capacity limit. If the `StringBuilder`'s capacity exceeds a safe threshold (e.g., 4KB) after use, it should be discarded and replaced with a new instance rather than blindly re-queued. ***Screenshots*** Not applicable (Heap dump footprint analysis describes the issue). ***Additional context*** This is a well-known anti-pattern in high-throughput environments. Relying on `ThreadLocal` to cache dynamically expanding buffers without a strict shrinkage policy inevitably leads to physical memory deadlocks in long-lived thread pools. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
