[ 
https://issues.apache.org/jira/browse/HIVE-16180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-16180:
-----------------------------------------
    Attachment: Native-mem-spike.png
                FullGC-15GB-cleanup.png
                Full-gc-native-mem-cleanup.png

Attaching snapshots from the analysis.
1) The native memory spike happened at around 11:00:00 PM (the configured offheap 
cache size is 48GB). At that point, memory used by direct byte buffers spiked to 
51.7GB. !Native-mem-spike.png!
2) At around the same time, a Full GC was triggered and reclaimed 15GB of memory 
(12GB heap + 3GB offheap). !FullGC-15GB-cleanup.png! !Full-gc-native-mem-cleanup.png!

> LLAP: Native memory leak in EncodedReader
> -----------------------------------------
>
>                 Key: HIVE-16180
>                 URL: https://issues.apache.org/jira/browse/HIVE-16180
>             Project: Hive
>          Issue Type: Bug
>          Components: llap
>    Affects Versions: 2.2.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>            Priority: Critical
>         Attachments: DirectCleaner.java, FullGC-15GB-cleanup.png, 
> Full-gc-native-mem-cleanup.png, HIVE-16180.1.patch, Native-mem-spike.png
>
>
> Observed this in an internal test run. There is a native memory leak in the ORC 
> EncodedReaderImpl that can cause the YARN pmem monitor to kill the container 
> running the daemon. Direct byte buffers are nulled out, but their native memory 
> is not guaranteed to be released until the next Full GC. To show this issue, I 
> am attaching a small test program that allocates 3x256MB direct byte buffers. 
> The first buffer is nulled out, but its native memory is still in use. The 
> second buffer uses a Cleaner to release its native allocation. The third buffer 
> is also nulled out, but this time System.gc() is invoked, which cleans up all 
> native memory. Output from the test program is below:
> {code}
> Allocating 3x256MB direct memory..
> Native memory used: 786432000
> Native memory used after data1=null: 786432000
> Native memory used after data2.clean(): 524288000
> Native memory used after data3=null: 524288000
> Native memory used without gc: 524288000
> Native memory used after gc: 0
> {code}
> Longer term improvements/solutions:
> 1) Use DirectBufferPool from Hadoop or Netty's PooledByteBufAllocator 
> (https://netty.io/4.0/api/io/netty/buffer/PooledByteBufAllocator.html), as 
> direct byte buffer allocations are expensive (System.gc() + 100ms thread 
> sleep).
> 2) Use HADOOP-12760 for proper cleaner invocation in JDK8 and JDK9
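
The test program mentioned in the description is attached to the issue as DirectCleaner.java and is not reproduced in this message. As a rough illustration of the same experiment, here is a minimal sketch. It is not the attached program: the class name DirectMemoryDemo and its helper methods are hypothetical, it reads usage from the JVM's "direct" BufferPoolMXBean, and it assumes JDK 9 or later, where sun.misc.Unsafe.invokeCleaner is available (JDK 8 needs a different cleaner mechanism). The buffer size defaults to 16MB instead of the 256MB used in the issue.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.nio.ByteBuffer;

// Hypothetical demo class; not the DirectCleaner.java attached to the issue.
final class DirectMemoryDemo {

    /** Bytes currently reserved by the JVM's "direct" buffer pool. */
    static long directMemoryUsed() {
        for (BufferPoolMXBean pool
                : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        throw new IllegalStateException("no 'direct' buffer pool MXBean found");
    }

    /** Releases a direct buffer's native memory immediately (assumes JDK 9+). */
    static void free(ByteBuffer buffer) {
        try {
            Class<?> unsafeClass = Class.forName("sun.misc.Unsafe");
            Field theUnsafe = unsafeClass.getDeclaredField("theUnsafe");
            theUnsafe.setAccessible(true);
            Method invokeCleaner =
                unsafeClass.getMethod("invokeCleaner", ByteBuffer.class);
            invokeCleaner.invoke(theUnsafe.get(null), buffer);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("explicit cleaner invocation failed", e);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Size in MB; pass 256 to match the sizes used in the issue.
        int size = (args.length > 0 ? Integer.parseInt(args[0]) : 16) * 1024 * 1024;

        ByteBuffer data1 = ByteBuffer.allocateDirect(size);
        ByteBuffer data2 = ByteBuffer.allocateDirect(size);
        ByteBuffer data3 = ByteBuffer.allocateDirect(size);
        System.out.println("Native memory used: " + directMemoryUsed());

        data1 = null; // drops the reference only; native memory waits for GC
        System.out.println("After data1=null: " + directMemoryUsed());

        free(data2);  // explicit release; data2 must not be touched afterwards
        data2 = null;
        System.out.println("After freeing data2: " + directMemoryUsed());

        data3 = null;
        System.out.println("After data3=null, before gc: " + directMemoryUsed());

        System.gc();       // a request only; the JVM may ignore or delay it
        Thread.sleep(100); // give the reference-processing thread time to run
        System.out.println("After gc: " + directMemoryUsed());
    }
}
```

The exact numbers depend on the JVM and on GC timing, so the sketch prints whatever it observes rather than promising the output quoted in the issue. Only the drop after the explicit free is deterministic.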

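The pooling idea in item 1 of the longer term improvements can be sketched with the JDK alone. This is only an illustration of the technique, not the API of Hadoop's DirectBufferPool or Netty's PooledByteBufAllocator: the class SimpleDirectBufferPool and its take/give methods are hypothetical names. Buffers are reused by exact capacity, so steady-state reads stop allocating and discarding native memory.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ConcurrentMap;

// Hypothetical minimal pool; real pools also bound their size and handle eviction.
final class SimpleDirectBufferPool {

    // Free buffers, grouped by exact capacity.
    private final ConcurrentMap<Integer, ConcurrentLinkedQueue<ByteBuffer>> free =
        new ConcurrentHashMap<>();

    /** Returns a cleared direct buffer of the given capacity, reusing one if possible. */
    ByteBuffer take(int capacity) {
        ConcurrentLinkedQueue<ByteBuffer> queue = free.get(capacity);
        ByteBuffer buffer = (queue == null) ? null : queue.poll();
        if (buffer == null) {
            return ByteBuffer.allocateDirect(capacity); // pool miss: allocate
        }
        buffer.clear(); // reset position and limit; contents are not zeroed
        return buffer;
    }

    /** Hands a buffer back; the caller must not use it after this call. */
    void give(ByteBuffer buffer) {
        if (!buffer.isDirect()) {
            throw new IllegalArgumentException("only direct buffers are pooled");
        }
        free.computeIfAbsent(buffer.capacity(), k -> new ConcurrentLinkedQueue<>())
            .add(buffer);
    }
}
```

A caller would pair every take with a give in a finally block. Note that this sketch never releases native memory on its own, so a production pool would need a size cap or explicit cleanup of idle buffers.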


--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
