[ https://issues.apache.org/jira/browse/HIVE-16180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15905946#comment-15905946 ]

Prasanth Jayachandran edited comment on HIVE-16180 at 3/11/17 12:53 AM:
------------------------------------------------------------------------

Attaching snapshots from the analysis.

1) A native memory spike occurred at around 11:00:00 PM (the configured off-heap cache size is 48GB). At that point, the memory used by direct byte buffers spiked to 51.7GB.

!Native-mem-spike.png!

2) At around the same time, a Full GC was triggered and reclaimed 15GB of memory (12GB heap + 3GB off-heap).

!FullGC-15GB-cleanup.png!
!Full-gc-native-mem-cleanup.png!


> LLAP: Native memory leak in EncodedReader
> -----------------------------------------
>
>                 Key: HIVE-16180
>                 URL: https://issues.apache.org/jira/browse/HIVE-16180
>             Project: Hive
>          Issue Type: Bug
>          Components: llap
>    Affects Versions: 2.2.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>            Priority: Critical
>         Attachments: DirectCleaner.java, Full-gc-native-mem-cleanup.png,
> HIVE-16180.1.patch, Native-mem-spike.png
>
>
> Observed this in an internal test run. There is a native memory leak in ORC
> EncodedReaderImpl that can cause the YARN pmem monitor to kill the container
> running the daemon. Direct byte buffers are nulled out, but their native
> memory is not guaranteed to be released until the next Full GC. To show this
> issue, attaching a small test program that allocates 3x256MB direct byte
> buffers. The first buffer is nulled out, but its native memory is still in
> use. The second buffer uses its Cleaner to release the native allocation.
> The third buffer is also nulled out, but this time followed by a call to
> System.gc(), which cleans up all native memory.
> Output from the test program is below:
> {code}
> Allocating 3x256MB direct memory..
> Native memory used: 786432000
> Native memory used after data1=null: 786432000
> Native memory used after data2.clean(): 524288000
> Native memory used after data3=null: 524288000
> Native memory used without gc: 524288000
> Native memory used after gc: 0
> {code}
> Longer term improvements/solutions:
> 1) Use DirectBufferPool from Hadoop or Netty's
> https://netty.io/4.0/api/io/netty/buffer/PooledByteBufAllocator.html, as
> direct byte buffer allocations are expensive (System.gc() + 100ms thread
> sleep).
> 2) Use HADOOP-12760 for proper cleaner invocation in JDK8 and JDK9



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
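
The attached DirectCleaner.java is not reproduced in this message. The experiment described in the issue can be approximated by the sketch below, which is an assumption about its shape rather than the attached program. The class name DirectBufferDemo and its helper methods are hypothetical, the buffers are scaled down to 16MB each so the sketch fits a default direct-memory limit, and the cleaner is invoked through reflection (sun.misc.Unsafe.invokeCleaner on JDK 9+, the buffer's cleaner() on JDK 8) instead of the HADOOP-12760 utility. Native usage is read from the JVM's "direct" BufferPoolMXBean.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.nio.ByteBuffer;

// Hypothetical demo class; not the DirectCleaner.java attached to the issue.
final class DirectBufferDemo {

    /** Returns the bytes currently reserved by the JVM's "direct" buffer pool. */
    public static long directMemoryUsed() {
        for (BufferPoolMXBean pool
                : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        throw new IllegalStateException("no 'direct' buffer pool MXBean found");
    }

    /** Releases the native memory of a direct buffer now, without waiting for GC. */
    public static void free(ByteBuffer buffer) throws ReflectiveOperationException {
        if (!buffer.isDirect()) {
            return; // heap buffers own no native memory
        }
        Class<?> unsafeClass = Class.forName("sun.misc.Unsafe");
        try {
            // JDK 9+: Unsafe.invokeCleaner(ByteBuffer)
            Method invokeCleaner = unsafeClass.getMethod("invokeCleaner", ByteBuffer.class);
            Field theUnsafe = unsafeClass.getDeclaredField("theUnsafe");
            theUnsafe.setAccessible(true);
            invokeCleaner.invoke(theUnsafe.get(null), buffer);
        } catch (NoSuchMethodException e) {
            // JDK 8: DirectByteBuffer.cleaner().clean()
            Method cleanerMethod = buffer.getClass().getMethod("cleaner");
            cleanerMethod.setAccessible(true);
            Object cleaner = cleanerMethod.invoke(buffer);
            Method clean = cleaner.getClass().getMethod("clean");
            clean.setAccessible(true);
            clean.invoke(cleaner);
        }
    }

    public static void main(String[] args) throws Exception {
        final int size = 16 * 1024 * 1024; // assumption: scaled down from the issue's 256MB

        ByteBuffer data1 = ByteBuffer.allocateDirect(size);
        ByteBuffer data2 = ByteBuffer.allocateDirect(size);
        ByteBuffer data3 = ByteBuffer.allocateDirect(size);
        System.out.println("Native memory used: " + directMemoryUsed());

        // Dropping the reference alone does not release native memory;
        // it is released only once the buffer object is garbage collected.
        data1 = null;
        System.out.println("Native memory used after data1=null: " + directMemoryUsed());

        // Explicit cleaner invocation releases the allocation immediately.
        // The buffer must not be touched after this call.
        free(data2);
        data2 = null;
        System.out.println("Native memory used after free(data2): " + directMemoryUsed());

        data3 = null;
        System.out.println("Native memory used after data3=null: " + directMemoryUsed());

        // System.gc() is only a hint, so the final figure is not guaranteed.
        System.gc();
        Thread.sleep(100);
        System.out.println("Native memory used after gc: " + directMemoryUsed());
    }
}
```

The exact numbers depend on the JVM and on when the collector runs, so no expected output is given. The pattern to look for is the one reported in the issue: usage drops immediately only after the explicit free, and the nulled-out buffers are reclaimed only after a collection.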