[ 
https://issues.apache.org/jira/browse/HIVE-16180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-16180:
-----------------------------------------
    Description: 
Observed this in an internal test run. There is a native memory leak in ORC 
EncodedReaderImpl that can cause the YARN pmem monitor to kill the container 
running the daemon. Direct byte buffers are nulled out, which does not 
guarantee that their native memory is freed before the next full GC. To 
demonstrate the issue, I am attaching a small test program that allocates 
3x256MB direct byte buffers. The first buffer is nulled out, but its native 
memory remains in use. The second buffer uses Cleaner to free its native 
allocation. The third buffer is also nulled out, but this time followed by a 
System.gc() call, which frees all remaining native memory. Output from the 
test program is below:

{code}
Allocating 3x256MB direct memory..
Native memory used: 786432000
Native memory used after data1=null: 786432000
Native memory used after data2.clean(): 524288000
Native memory used after data3=null: 524288000
Native memory used without gc: 524288000
Native memory used after gc: 0
{code}
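
For readers without access to the attachment, the experiment described above can be approximated with the sketch below. It is not the attached DirectCleaner.java: the class name DirectBufferReleaseSketch, the helper names, and the 16MB buffer size are assumptions made for illustration. It reads the JVM "direct" buffer pool through BufferPoolMXBean and releases a buffer explicitly via reflection (Unsafe.invokeCleaner where available, otherwise the JDK 8 style cleaner() method). Exact numbers will vary by JVM and GC, so no output is claimed here.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.nio.ByteBuffer;

// Illustrative sketch only; names and sizes are assumptions, not the attached test program.
final class DirectBufferReleaseSketch {

    /** Returns the bytes currently accounted to the JVM "direct" buffer pool, or -1 if not found. */
    static long directMemoryUsed() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1L;
    }

    /** Frees the native memory behind a direct buffer without waiting for GC. */
    static void clean(ByteBuffer buffer) {
        if (buffer == null || !buffer.isDirect()) {
            return;
        }
        try {
            Class<?> unsafeClass = Class.forName("sun.misc.Unsafe");
            Method invokeCleaner;
            try {
                // Present on JDK 9 and later.
                invokeCleaner = unsafeClass.getMethod("invokeCleaner", ByteBuffer.class);
            } catch (NoSuchMethodException e) {
                invokeCleaner = null;
            }
            if (invokeCleaner != null) {
                Field theUnsafe = unsafeClass.getDeclaredField("theUnsafe");
                theUnsafe.setAccessible(true);
                invokeCleaner.invoke(theUnsafe.get(null), buffer);
            } else {
                // JDK 8 fallback: buffer.cleaner().clean() on the internal DirectByteBuffer class.
                Method cleanerMethod = buffer.getClass().getMethod("cleaner");
                cleanerMethod.setAccessible(true);
                Object cleaner = cleanerMethod.invoke(buffer);
                cleaner.getClass().getMethod("clean").invoke(cleaner);
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not release direct buffer", e);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        final int size = 16 * 1024 * 1024; // assumption: much smaller than the buffers in the report
        ByteBuffer data1 = ByteBuffer.allocateDirect(size);
        ByteBuffer data2 = ByteBuffer.allocateDirect(size);
        ByteBuffer data3 = ByteBuffer.allocateDirect(size);
        System.out.println("Native memory used: " + directMemoryUsed());

        data1 = null; // drops the reference only; native memory waits for a GC
        System.out.println("Native memory used after data1=null: " + directMemoryUsed());

        clean(data2); // explicit release; data2 must not be touched afterwards
        data2 = null;
        System.out.println("Native memory used after clean(data2): " + directMemoryUsed());

        data3 = null;
        System.out.println("Native memory used after data3=null: " + directMemoryUsed());

        System.gc();       // only a request to the JVM
        Thread.sleep(500); // cleaners run asynchronously after the collection
        System.out.println("Native memory used after gc: " + directMemoryUsed());
    }
}
```

The explicit clean() call is the only step here whose effect does not depend on when the collector runs, which is why it is the relevant one for a long-running daemon under a YARN pmem limit.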

Longer term improvements/solutions:
1) Use DirectBufferPool from Hadoop or Netty's 
https://netty.io/4.0/api/io/netty/buffer/PooledByteBufAllocator.html, as direct 
byte buffer allocations are expensive (System.gc() + 100ms thread sleep).
2) Use HADOOP-12760 for proper cleaner invocation in JDK8 and JDK9.
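
To illustrate the pooling idea in item 1, here is a minimal sketch of a fixed-size direct buffer pool. It is not Hadoop's DirectBufferPool or Netty's PooledByteBufAllocator, both of which are considerably more involved; the class SimpleDirectBufferPool and its method names are hypothetical. The point is only that buffers are allocated once and then reused, so the allocation cost and the dependence on GC for release are paid rarely.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical minimal pool for illustration; not the Hadoop or Netty implementation.
final class SimpleDirectBufferPool {
    private final int bufferSize;
    private final ConcurrentLinkedQueue<ByteBuffer> idle = new ConcurrentLinkedQueue<>();

    SimpleDirectBufferPool(int bufferSize) {
        this.bufferSize = bufferSize;
    }

    /** Returns an idle buffer if one exists, otherwise allocates a new direct buffer. */
    ByteBuffer acquire() {
        ByteBuffer buffer = idle.poll();
        if (buffer == null) {
            return ByteBuffer.allocateDirect(bufferSize);
        }
        buffer.clear(); // reset position and limit; contents are not zeroed
        return buffer;
    }

    /** Returns a buffer to the pool; buffers of the wrong kind or size are ignored. */
    void release(ByteBuffer buffer) {
        if (buffer != null && buffer.isDirect() && buffer.capacity() == bufferSize) {
            idle.offer(buffer);
        }
    }

    /** Number of buffers currently waiting to be reused. */
    int idleCount() {
        return idle.size();
    }
}
```

A caller would pair every acquire() with a release() in a finally block. Note that this sketch never shrinks, so it trades a bounded amount of retained native memory for fewer allocations.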


  was:
Observed this in internal test run. There is a native memory leak in Orc 
EncodedReaderImpl that can cause YARN pmem monitor to kill the container 
running the daemon. Direct byte buffers are null'ed out which is not guaranteed 
to be cleaned until next Full GC. To show this take issue, attaching a small 
test program that allocations 3x256MB direct byte buffers. First buffer is 
null'ed out but still native memory is used. Second buffer user Cleaner to 
clean up native allocation. Third buffer is also null'ed but this time invoking 
a System.gc() which cleans up all native memory. Output from the test program 
is below

{code}
Allocating 3x256MB direct memory..
Native memory used: 786432000
Native memory used after data1=null: 786432000
Native memory used after data2.clean(): 524288000
Native memory used after data3=null: 524288000
Native memory used without gc: 524288000
Native memory used after gc: 0
{code}

Longer term improvements/solutions:
1) Use DirectBufferPool from hadoop or netty's 
https://netty.io/4.0/api/io/netty/buffer/PooledByteBufAllocator.html as direct 
byte buffer allocations are expensive (System.gc() + 100ms thread sleep).
2) Use HADOOP-12760 for proper cleaner invocation in JDK8 and JDK9



> LLAP: Native memory leak in EncodedReader
> -----------------------------------------
>
>                 Key: HIVE-16180
>                 URL: https://issues.apache.org/jira/browse/HIVE-16180
>             Project: Hive
>          Issue Type: Bug
>          Components: llap
>    Affects Versions: 2.2.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>            Priority: Critical
>         Attachments: DirectCleaner.java, HIVE-16180.1.patch
>
>
> Observed this in an internal test run. There is a native memory leak in ORC 
> EncodedReaderImpl that can cause the YARN pmem monitor to kill the container 
> running the daemon. Direct byte buffers are nulled out, which does not 
> guarantee that their native memory is freed before the next full GC. To 
> demonstrate the issue, I am attaching a small test program that allocates 
> 3x256MB direct byte buffers. The first buffer is nulled out, but its native 
> memory remains in use. The second buffer uses Cleaner to free its native 
> allocation. The third buffer is also nulled out, but this time followed by a 
> System.gc() call, which frees all remaining native memory. Output from the 
> test program is below:
> {code}
> Allocating 3x256MB direct memory..
> Native memory used: 786432000
> Native memory used after data1=null: 786432000
> Native memory used after data2.clean(): 524288000
> Native memory used after data3=null: 524288000
> Native memory used without gc: 524288000
> Native memory used after gc: 0
> {code}
> Longer term improvements/solutions:
> 1) Use DirectBufferPool from Hadoop or Netty's 
> https://netty.io/4.0/api/io/netty/buffer/PooledByteBufAllocator.html, as 
> direct byte buffer allocations are expensive (System.gc() + 100ms thread 
> sleep).
> 2) Use HADOOP-12760 for proper cleaner invocation in JDK8 and JDK9.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
