[ https://issues.apache.org/jira/browse/KUDU-2836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860593#comment-16860593 ]
Yingchun Lai commented on KUDU-2836:
------------------------------------

Once mem_tracker_->Release(bytes) is called with a positive bytes value, GcTcmalloc() will be called if a lot of releasable memory has accumulated, right? I found many mem_tracker_->Release calls in the code. Is there a situation in which it will never be called?

> Maybe wrong memory size used to detect pressure
> -----------------------------------------------
>
>                 Key: KUDU-2836
>                 URL: https://issues.apache.org/jira/browse/KUDU-2836
>             Project: Kudu
>          Issue Type: Improvement
>          Components: tserver
>            Reporter: Yingchun Lai
>            Assignee: Yingchun Lai
>            Priority: Critical
>         Attachments: 选区_313.jpg
>
>
> One of my tservers has 128G of memory in total and runs with these gflags:
> {code:java}
> -memory_limit_hard_bytes=107374182475 (100G)
> -memory_limit_soft_percentage=85
> -memory_pressure_percentage=80{code}
> Memory usage is about 95%. The "top" output looks like this:
> {code:java}
>  PID USER PR NI VIRT   RES    SHR   S %CPU  %MEM TIME+     COMMAND
> 8359 work 20 0  0.326t 0.116t 81780 S 727.9 94.6 230228:10 kudu_tablet_ser
> {code}
> That is, the kudu_tablet_server process uses about 116G of memory.
> On the mem-trackers page, the "Total consumption" value is about 65G, much
> lower than 116G.
> I then logged in to the server and read the code to check whether the MM
> operations that free memory work correctly. Unfortunately, the memory
> pressure detection function (process_memory::UnderMemoryPressure) does not
> report that the process is under pressure, because the tcmalloc function
> GetNumericProperty(const char* property, size_t* value), called with the
> property "generic.current_allocated_bytes", does not return the same value
> as the memory use reported by the OS.
> [https://gperftools.github.io/gperftools/tcmalloc.html]
> {quote}
> |{{generic.current_allocated_bytes}}|Number of bytes used by the application. This will not typically match the memory use reported by the OS, because it does not include TCMalloc overhead or memory fragmentation.|
> {quote}
> As a result, the ops that would free memory may not be scheduled promptly,
> the OS may run out of memory, and the tserver may then be killed because
> of OOM.
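
The arithmetic behind the report can be sketched as follows. This is a minimal illustration, not Kudu code: it assumes the pressure check compares a consumption figure against memory_limit_hard_bytes * memory_pressure_percentage / 100, and the helper names PressureThresholdBytes and UnderPressure are hypothetical. The 65G and 116G inputs are the approximate values quoted above, taken as GiB.

```cpp
#include <cassert>
#include <cstdint>

// One GiB in bytes, used to express the approximate figures from the report.
constexpr int64_t kGiB = int64_t{1} << 30;

// Hypothetical helper (assumption, not Kudu's API): derives the pressure
// threshold from the hard limit and the pressure percentage gflags.
inline int64_t PressureThresholdBytes(int64_t hard_limit_bytes,
                                      int pressure_percentage) {
  assert(pressure_percentage >= 0 && pressure_percentage <= 100);
  return hard_limit_bytes * pressure_percentage / 100;
}

// Hypothetical helper (assumption): the process counts as under pressure
// once the measured consumption reaches the threshold.
inline bool UnderPressure(int64_t consumption_bytes, int64_t threshold_bytes) {
  return consumption_bytes >= threshold_bytes;
}
```

With the gflags from the report, PressureThresholdBytes(107374182475, 80) is about 80G. Under these assumptions, UnderPressure(65 * kGiB, threshold) is false, while UnderPressure(116 * kGiB, threshold) is true. The allocator-reported figure therefore stays below the threshold even though the OS-reported resident size is well above it.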