Doyle-net commented on issue #3964:
URL: https://github.com/apache/hertzbeat/issues/3964#issuecomment-3747976713

   > Thanks for reporting this. Based on the codebase analysis, this appears to
   > be a **memory leak in Apache Arrow direct buffer management**, likely
   > caused by multiple resource management issues:
   > 
   > **Evidence:**
   > 
   > * `CollectRep.MetricsData.Builder.build()` creates a new `RootAllocator`
   >   per call
   >   ([CollectRep.java:410](https://github.com/apache/hertzbeat/blob/main/hertzbeat-common/src/main/java/org/apache/hertzbeat/common/entity/message/CollectRep.java#L410))
   >   but only closes it on exception
   >   ([line 463](https://github.com/apache/hertzbeat/blob/main/hertzbeat-common/src/main/java/org/apache/hertzbeat/common/entity/message/CollectRep.java#L463)),
   >   not in the normal path
   > * `MetricsData` implements `AutoCloseable` but most data flow paths
   >   (Collector → Queue → Alerter → Storage) don't call `close()`
   >   ([CommonDispatcher.java:310](https://github.com/apache/hertzbeat/blob/main/hertzbeat-collector/hertzbeat-collector-collector/src/main/java/org/apache/hertzbeat/collector/dispatch/CommonDispatcher.java#L310))
   > * `RedisMetricsDataCodec` creates its own `RootAllocator` per instance
   >   that is never closed
   >   ([RedisMetricsDataCodec.java:47](https://github.com/apache/hertzbeat/blob/main/hertzbeat-common/src/main/java/org/apache/hertzbeat/common/serialize/RedisMetricsDataCodec.java#L47))
   > 
   > With 438 collection tasks running continuously, even a small per-task leak
   > accumulates to ~10GB over 8-9 hours, matching your error state.
   > 
   > **To confirm the diagnosis:**
   > 
   > 1. Which queue implementation are you using (Redis/Memory/Kafka)?
   > 2. Can you share your complete JVM startup parameters?
   > 
   > **Workaround while fixing:** Increase `-XX:MaxDirectMemorySize` to 16g+
   > and reduce concurrent task count.
   
   I am currently using the in-memory queue.
   Increasing `-XX:MaxDirectMemorySize` does not solve the problem. Analysis
   of the source code reveals a serious memory leak bug in the queue
   implementation.
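
   To illustrate the pattern described in the quoted evidence (an
   `AutoCloseable` item that passes through a queue without anyone calling
   `close()`), here is a minimal sketch. It is not HertzBeat code:
   `QueueCloseSketch`, `TrackedData`, `drainLeaky`, `drainSafe`, and `run` are
   hypothetical names, and `TrackedData` is only a stand-in that counts
   unclosed instances instead of holding real Arrow direct memory.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

public class QueueCloseSketch {

    /** Hypothetical stand-in for a closeable metrics object; counts instances not yet closed. */
    static final class TrackedData implements AutoCloseable {
        static final AtomicInteger OPEN = new AtomicInteger();
        private boolean closed;

        TrackedData() {
            OPEN.incrementAndGet();
        }

        @Override
        public void close() {
            // Idempotent close: only the first call releases the "resource".
            if (!closed) {
                closed = true;
                OPEN.decrementAndGet();
            }
        }
    }

    /** Placeholder for whatever the consumer does with an item (assumption). */
    static void process(TrackedData data) {
        // no-op in this sketch
    }

    /** Leaky consumer: items are processed but never closed. */
    static void drainLeaky(BlockingQueue<TrackedData> queue) {
        TrackedData data;
        while ((data = queue.poll()) != null) {
            process(data);
        }
    }

    /** Safe consumer: try-with-resources closes each item, even if process() throws. */
    static void drainSafe(BlockingQueue<TrackedData> queue) {
        TrackedData polled;
        while ((polled = queue.poll()) != null) {
            try (TrackedData data = polled) {
                process(data);
            }
        }
    }

    /** Enqueues `items` objects, drains them, and returns how many remain unclosed. */
    static int run(boolean safe, int items) {
        TrackedData.OPEN.set(0);
        BlockingQueue<TrackedData> queue = new LinkedBlockingQueue<>();
        for (int i = 0; i < items; i++) {
            queue.add(new TrackedData());
        }
        if (safe) {
            drainSafe(queue);
        } else {
            drainLeaky(queue);
        }
        return TrackedData.OPEN.get();
    }

    public static void main(String[] args) {
        System.out.println("unclosed after leaky drain: " + run(false, 438)); // 438
        System.out.println("unclosed after safe drain: " + run(true, 438));   // 0
    }
}
```

   In the sketch the last consumer owns the item and closes it. Where an item
   is handed to several consumers (for example alerter and storage), the real
   fix would need some ownership or reference-counting rule, which this sketch
   does not attempt to model.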


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

