qianye1001 commented on issue #10515: URL: https://github.com/apache/rocketmq/issues/10515#issuecomment-4704915982
Thanks for the detailed profiling work — JFR analysis is always appreciated! ## Item 3 (busy-spin fix) — definitely worth fixing This is a real correctness bug: `waitForRunning()` inside the `if` branch means the loop spins without yielding when the interval hasn't elapsed. Agree with the bot's suggestion — this should be a separate bug-fix PR and prioritized. ## Items 1 & 2 (string key caching) — I'd recommend not doing these While JFR does show allocation events for `StringBuilder`/`String` from key concatenation, I don't think caching them is a net improvement. Here's why: **The "problem" is nearly free in practice.** On JDK 9+, `topic + "@" + group` compiles to `invokedynamic` → `StringConcatFactory`, which is already well-optimized. The resulting strings are small, short-lived objects allocated in TLAB (essentially a pointer bump), and collected cheaply in Young GC. Seeing allocation events in JFR `settings=profile` is expected — it samples *all* allocations, but high event count ≠ high cost. **The cache is likely more expensive than what it replaces.** The proposed 2-level `ConcurrentHashMap<String, ConcurrentHashMap<String, String>>` requires two CHM lookups per call (hash computation + volatile reads). Each CHM `Node` entry carries ~32–48 bytes of overhead (object header + hash + key/value references + next pointer). For a moderate number of topic×group pairs, the cache's memory footprint and lookup cost could exceed the cost of simply concatenating a short string. **It also introduces a memory leak risk.** There's no eviction logic in the proposed code — when topics or consumer groups are removed, cached entries remain indefinitely. Over time with topic/group churn, the cache grows without bound. The same reasoning applies to the `String[]` array cache in `PullRequestHoldService` — it adds bounds-checking complexity and dynamic resizing concerns, all to avoid a `new StringBuilder(topic.length() + 10)` that costs a few nanoseconds. **Suggestion:** Focus this issue on the busy-spin fix (item 3), which has clear and measurable impact. For items 1 & 2, it might be worth verifying with a benchmark (e.g., JMH) whether the caching actually improves throughput or latency — my expectation is that the difference would be within noise. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
