[ 
https://issues.apache.org/jira/browse/HIVE-27781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pravin Sinha updated HIVE-27781:
--------------------------------
    Description: 
In the SASL transport case, the Thrift layer uses ByteArrayOutputStream during HMS API 
response data transfer. ByteArrayOutputStream uses a byte[] as the underlying 
buffer to store the data to be written.

ByteArrayOutputStream has a constant defined as MAX_ARRAY_SIZE, which is 8 bytes 
less than Integer.MAX_VALUE:

*private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;*

While ensuring the new capacity, there are two possible cases where the write 
can fail because of size:

case a) when *MAX_ARRAY_SIZE* < newCapacity <= *Integer.MAX_VALUE*

case b) when newCapacity > *Integer.MAX_VALUE* (the int computation overflows to a negative value)

*ByteArrayOutputStream* can detect that a new write will lead to case b) and 
explicitly throws OutOfMemoryError:

private static int hugeCapacity(int minCapacity) {
    if (minCapacity < 0) // overflow
        throw new OutOfMemoryError();
    return (minCapacity > MAX_ARRAY_SIZE) ?
        Integer.MAX_VALUE :
        MAX_ARRAY_SIZE;
}
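The two failure cases come down to how the requested capacity compares with these limits. A minimal sketch (the class and method names are illustrative, not from Hive or the JDK) that classifies a requested capacity the way JDK 8's growth logic treats it:

```java
// Illustrative sketch: classify a requested buffer capacity the way JDK 8's
// ByteArrayOutputStream growth logic does. Class/method names are hypothetical.
public class CapacityCases {
    private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    // The real code computes newCapacity as an int, so requests beyond
    // Integer.MAX_VALUE wrap around to a negative value -- that is case b).
    static String classify(long requested) {
        int asInt = (int) requested; // simulates the int arithmetic
        if (asInt < 0) {
            return "case b: hugeCapacity() throws OutOfMemoryError explicitly";
        }
        if (asInt > MAX_ARRAY_SIZE) {
            return "case a: JVM itself throws OutOfMemoryError on array allocation";
        }
        return "ok: buffer grows normally";
    }

    public static void main(String[] args) {
        System.out.println(classify(1_000_000L));                     // ok
        System.out.println(classify((long) Integer.MAX_VALUE - 4));   // case a
        System.out.println(classify((long) Integer.MAX_VALUE + 10));  // case b
    }
}
```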

 

For case a), it relies on the JVM to throw OutOfMemoryError (typically "Requested array size exceeds VM limit") when the oversized array is allocated.

 

If {{*-XX:+ExitOnOutOfMemoryError*}} is passed, the JVM (in JDK 8) behaves 
differently for an OutOfMemoryError thrown by application code (case b) and one 
which the JVM itself throws (case a). It honors this flag only when the JVM 
throws the OutOfMemoryError, and then it crashes the process, regardless of how 
much heap space HMS is running with.
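The asymmetry can be illustrated with a hypothetical demo class (not part of Hive): an OutOfMemoryError thrown by application code is an ordinary Java throw, so it can even be caught, and -XX:+ExitOnOutOfMemoryError on JDK 8 does not react to it; the flag fires only for errors the JVM raises itself.

```java
// Hypothetical demo: an application-thrown OutOfMemoryError (case b) is an
// ordinary throw. Run with -XX:+ExitOnOutOfMemoryError on JDK 8 and the
// process still survives, because the flag only reacts to OOMEs raised by
// the JVM itself (e.g. "Requested array size exceeds VM limit" in case a).
public class AppThrownOome {
    static String simulate() {
        try {
            // Mimics ByteArrayOutputStream.hugeCapacity() on int overflow:
            throw new OutOfMemoryError("application-thrown");
        } catch (OutOfMemoryError e) {
            // Reached even with -XX:+ExitOnOutOfMemoryError set.
            return "survived: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(simulate());
    }
}
```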

This is observed mainly in getPartition cases where the table contains a large 
number of partitions, resulting in an HMS crash while the serialized bytes of the 
Partition objects are written to ByteArrayOutputStream's buffer.
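The path to case a) during partition serialization can be sketched as follows: JDK 8's ByteArrayOutputStream doubles its buffer on each growth, so once the serialized response passes ~1 GiB, the very next doubling requests more than MAX_ARRAY_SIZE (illustrative class name; the doubling behavior is from JDK 8's grow()).

```java
// Sketch: JDK 8's ByteArrayOutputStream doubles its buffer when it grows
// (newCapacity = oldCapacity << 1). Class/method names here are illustrative.
public class GrowthSim {
    private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    // Keep doubling from the default 32-byte buffer while the doubled
    // size still fits within MAX_ARRAY_SIZE.
    static int lastSafeDoubling() {
        int cap = 32; // ByteArrayOutputStream's default initial capacity
        while (cap <= MAX_ARRAY_SIZE / 2) {
            cap <<= 1;
        }
        return cap; // 1 GiB (2^30)
    }

    public static void main(String[] args) {
        int cap = lastSafeDoubling();
        long next = (long) cap << 1; // the next doubling: 2^31
        System.out.println("buffer: " + cap + ", next doubling: " + next
                + ", exceeds MAX_ARRAY_SIZE: " + (next > MAX_ARRAY_SIZE));
    }
}
```

So any response whose serialized size crosses the 1 GiB buffer mark pushes the stream into the hugeCapacity() path, and from there into case a) or case b).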

 

> HMS crashes with OOM even though there is enough heap space
> -----------------------------------------------------------
>
>                 Key: HIVE-27781
>                 URL: https://issues.apache.org/jira/browse/HIVE-27781
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Pravin Sinha
>            Assignee: Pravin Sinha
>            Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
