[ https://issues.apache.org/jira/browse/KUDU-2836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860242#comment-16860242 ]

Todd Lipcon commented on KUDU-2836:
-----------------------------------

Looking at the code of GcTcmalloc:

{code}
  // Number of bytes in the 'NORMAL' free list (i.e reserved by tcmalloc but
  // not in use).
  int64_t bytes_overhead = GetTCMallocProperty("tcmalloc.pageheap_free_bytes");
  // Bytes allocated by the application.
  int64_t bytes_used = GetTCMallocCurrentAllocatedBytes();
{code}
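
For reference, GetTCMallocProperty() presumably just wraps gperftools' MallocExtension numeric-property API; a minimal sketch of that assumption (the helper name and error handling here are mine, not the actual Kudu code):

{code}
// Hedged sketch, not the actual Kudu helper: query a tcmalloc numeric
// property such as "tcmalloc.pageheap_free_bytes" via gperftools.
#include <gperftools/malloc_extension.h>
#include <cstdint>

int64_t GetTCMallocPropertySketch(const char* prop) {
  size_t value = 0;
  if (!MallocExtension::instance()->GetNumericProperty(prop, &value)) {
    return -1;  // property not supported by this tcmalloc build
  }
  return static_cast<int64_t>(value);
}
{code}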

so it seems that bytes_overhead should be about 54653.6MB (based on your memz), 
while bytes_used (generic.current_allocated_bytes) is computed as follows in 
the tcmalloc code:
{code}
      TCMallocStats stats; 
      ExtractStats(&stats, NULL, NULL, NULL);
      *value = stats.pageheap.system_bytes
               - stats.thread_bytes
               - stats.central_bytes
               - stats.transfer_bytes
               - stats.pageheap.free_bytes
               - stats.pageheap.unmapped_bytes;
{code}

Looking at the memz code, this is equivalent to the 'bytes in use by 
application' value (65321MB).

So:
{code}
  int64_t max_overhead = bytes_used * FLAGS_tcmalloc_max_free_bytes_percentage / 100.0;
{code}
should be 65321MB * 10/100 = 6532MB.

bytes_overhead (54653MB) is much larger than max_overhead, so GcTcmalloc() 
should be calling into 'ReleaseToSystem' to release the excess memory until the 
heap overhead is much smaller.
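
In other words, the GC step should behave roughly like the sketch below (the exact release logic in Kudu may differ; MallocExtension::ReleaseToSystem() is the gperftools call that hands free pages back to the OS):

{code}
// Hedged sketch of the expected GC behavior, not the exact Kudu implementation:
// release the free-list bytes that exceed the allowed overhead back to the OS.
#include <gperftools/malloc_extension.h>
#include <cstdint>

void GcTcmallocSketch(int64_t bytes_overhead, int64_t max_overhead) {
  if (bytes_overhead <= max_overhead) return;  // within the allowed overhead
  int64_t extra = bytes_overhead - max_overhead;
  // Ask tcmalloc to unmap (or madvise away) 'extra' bytes of its free pages.
  MallocExtension::instance()->ReleaseToSystem(static_cast<size_t>(extra));
}
{code}

With the numbers above, that would ask tcmalloc to release roughly 54653MB - 6532MB ≈ 48000MB.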

It seems, then, that there might be a workload on this system such that 
GcTcmalloc isn't getting called. Perhaps there is some allocation-heavy 
workload that doesn't trigger MemTracker::Release -- maybe scan-heavy with no 
writes? In that case, tcmalloc might accumulate a lot of releasable memory over 
time. If this turns out to be the case, we can either try to make sure that 
workload uses memtrackers, or add some other code path to trigger GcTcmalloc(), 
as in the sketch below.
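
For example, one such extra code path could be a periodic trigger along these lines (the maintenance thread/loop here is purely illustrative; only kudu::process_memory::GcTcmalloc() is the existing entry point):

{code}
// Hypothetical sketch of an extra code path that runs the GC periodically,
// independent of MemTracker::Release(). The loop is illustrative only.
#include <chrono>
#include <thread>

namespace kudu { namespace process_memory { void GcTcmalloc(); } }

void PeriodicTcmallocGcLoop() {
  while (true) {
    std::this_thread::sleep_for(std::chrono::seconds(60));
    kudu::process_memory::GcTcmalloc();  // existing Kudu entry point
  }
}
{code}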

Would it be possible to attach with GDB and use 'call 
kudu::process_memory::GcTcmalloc()' to force a GC, then detach? I'm curious to 
see whether that releases memory back to the OS.

Another option would be to try to enable tcmalloc aggressive-decommit in your 
configuration by setting the TCMALLOC_AGGRESSIVE_DECOMMIT=true environment 
variable. This could potentially impact latency but ought to reduce memory 
consumption.
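
If changing the environment at startup is inconvenient, the same behavior could presumably also be toggled at runtime via the MallocExtension numeric-property API; a hedged sketch, assuming the gperftools build Kudu links against exposes the "tcmalloc.aggressive_memory_decommit" property:

{code}
// Hedged sketch: enable aggressive decommit programmatically. Assumes the
// linked gperftools exposes "tcmalloc.aggressive_memory_decommit"
// (the runtime equivalent of TCMALLOC_AGGRESSIVE_DECOMMIT=true).
#include <gperftools/malloc_extension.h>

void EnableAggressiveDecommit() {
  MallocExtension::instance()->SetNumericProperty(
      "tcmalloc.aggressive_memory_decommit", 1);
}
{code}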

> Maybe wrong memory size used to detect pressure
> -----------------------------------------------
>
>                 Key: KUDU-2836
>                 URL: https://issues.apache.org/jira/browse/KUDU-2836
>             Project: Kudu
>          Issue Type: Improvement
>          Components: tserver
>            Reporter: Yingchun Lai
>            Assignee: Yingchun Lai
>            Priority: Critical
>
> One of my tservers has 128G of memory in total, with gflags: 
> {code:java}
> -memory_limit_hard_bytes=107374182475 (100G)
> -memory_limit_soft_percentage=85
> -memory_pressure_percentage=80{code}
> Memory usage is about 95%; the "top" output looks like:
> {code:java}
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 8359 work 20 0 0.326t 0.116t 81780 S 727.9 94.6 230228:10 kudu_tablet_ser
> {code}
> That is, the kudu_tablet_server process uses about 116G of memory.
> On the mem-trackers page, I find the "Total consumption" value is about 65G, 
> much lower than 116G.
> Then I logged in to the server and read the code to check whether the 
> memory-freeing operations were working correctly. Unfortunately, the memory 
> pressure detection function (process_memory::UnderMemoryPressure) doesn't 
> report that it's under pressure, because the tcmalloc function 
> GetNumericProperty(const char* property, size_t* value) queried with 
> "generic.current_allocated_bytes" doesn't return the memory use reported by 
> the OS.
> [https://gperftools.github.io/gperftools/tcmalloc.html]
> {quote}
> |{{generic.current_allocated_bytes}}|Number of bytes used by the application. 
> This will not typically match the memory use reported by the OS, because it 
> does not include TCMalloc overhead or memory fragmentation.|
> {quote}
> This situation may mean that operations which prefer to free memory cannot be 
> scheduled promptly, so the OS may run out of memory and eventually kill the 
> tserver due to OOM.


