[ https://issues.apache.org/jira/browse/THRIFT-3914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17897197#comment-17897197 ]

Sercan Tekin commented on THRIFT-3914:
--------------------------------------

I hit the same issue with Hive when authentication is enabled.

{code:java}
Exception in thread "pool-7-thread-7" java.lang.OutOfMemoryError
        at java.io.ByteArrayOutputStream.hugeCapacity(ByteArrayOutputStream.java:123)
        at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:117)
        at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
        at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:153)
        at org.apache.thrift.transport.TSaslTransport.write(TSaslTransport.java:473)
        at org.apache.thrift.transport.TSaslServerTransport.write(TSaslServerTransport.java:42)
        at org.apache.thrift.protocol.TBinaryProtocol.writeString(TBinaryProtocol.java:227)
        at org.apache.hadoop.hive.metastore.api.FieldSchema$FieldSchemaStandardScheme.write(FieldSchema.java:517)
        at org.apache.hadoop.hive.metastore.api.FieldSchema$FieldSchemaStandardScheme.write(FieldSchema.java:456)
        at org.apache.hadoop.hive.metastore.api.FieldSchema.write(FieldSchema.java:394)
        at org.apache.hadoop.hive.metastore.api.StorageDescriptor$StorageDescriptorStandardScheme.write(StorageDescriptor.java:1423)
        at org.apache.hadoop.hive.metastore.api.StorageDescriptor$StorageDescriptorStandardScheme.write(StorageDescriptor.java:1250)
        at org.apache.hadoop.hive.metastore.api.StorageDescriptor.write(StorageDescriptor.java:1116)
        at org.apache.hadoop.hive.metastore.api.Partition$PartitionStandardScheme.write(Partition.java:1033)
        at org.apache.hadoop.hive.metastore.api.Partition$PartitionStandardScheme.write(Partition.java:890)
        at org.apache.hadoop.hive.metastore.api.Partition.write(Partition.java:786)
{code}

I believe the cause is Java's conservative limit on the maximum array size; see 
[this|https://github.com/openjdk/jdk/blob/0e0dfca21f64ecfcb3e5ed7cdc2a173834faa509/src/java.base/share/classes/java/io/InputStream.java#L307-L313].
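For context, the growth path in the stack trace above bottoms out in a capacity check along these lines (a simplified sketch of the JDK 8 {{ByteArrayOutputStream.hugeCapacity}} logic, not the verbatim source):

{code:java}
public class HugeCapacitySketch {
    // Conservative soft cap: some VMs reserve header words in arrays, so
    // a request for exactly Integer.MAX_VALUE bytes can fail with
    // "Requested array size exceeds VM limit".
    static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    // Sketch of the hugeCapacity check: a negative minCapacity means the
    // doubling overflowed int, and anything past the soft cap forces a
    // request for Integer.MAX_VALUE bytes, which the VM typically rejects.
    static int hugeCapacity(int minCapacity) {
        if (minCapacity < 0) { // int overflow during growth
            throw new OutOfMemoryError("capacity overflow");
        }
        return (minCapacity > MAX_ARRAY_SIZE) ? Integer.MAX_VALUE : MAX_ARRAY_SIZE;
    }

    public static void main(String[] args) {
        System.out.println(hugeCapacity(100)); // clamped to MAX_ARRAY_SIZE
    }
}
{code}

So any single SASL frame buffered past roughly 2 GB is doomed regardless of heap size: the limit is the maximum size of one Java array, not available memory.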

Spark has followed Java's approach to fix the issue:
https://github.com/apache/spark/commit/e5a5921968c84601ce005a7785bdd08c41a2d862#diff-607488c104788f0156de87abab394cf33aa76148b1e3d122d328e165a25c1838R22
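The idea behind the Spark fix is to buffer into a list of fixed-size chunks instead of one contiguous {{byte[]}}, so the total buffered size is no longer bounded by the maximum size of a single array. A hypothetical minimal sketch of that approach (names and sizes are illustrative, not Spark's actual implementation):

{code:java}
import java.io.IOException;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical chunked buffer: data is appended to fixed-size chunks,
// so no single array ever needs to hold the whole payload.
public class ChunkedOutputStreamSketch extends OutputStream {
    private final int chunkSize;
    private final List<byte[]> chunks = new ArrayList<>();
    private int posInChunk;  // write position inside the last chunk
    private long totalSize;  // total bytes written (long: can exceed 2 GB)

    public ChunkedOutputStreamSketch(int chunkSize) {
        this.chunkSize = chunkSize;
        this.posInChunk = chunkSize; // forces allocation on first write
    }

    @Override
    public void write(int b) {
        if (posInChunk == chunkSize) {       // current chunk full: start a new one
            chunks.add(new byte[chunkSize]);
            posInChunk = 0;
        }
        chunks.get(chunks.size() - 1)[posInChunk++] = (byte) b;
        totalSize++;
    }

    public long size() { return totalSize; }

    public static void main(String[] args) throws IOException {
        ChunkedOutputStreamSketch out = new ChunkedOutputStreamSketch(4);
        out.write(new byte[]{1, 2, 3, 4, 5, 6}); // spills into a second chunk
        System.out.println(out.size());          // 6
    }
}
{code}

A real fix in TSaslTransport would also need chunk-aware reads when flushing frames, but the sketch shows why the single-array ceiling disappears.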

 

> TSaslServerTransport throws OOM due to ByteArrayOutputStream limitation
> -----------------------------------------------------------------------
>
>                 Key: THRIFT-3914
>                 URL: https://issues.apache.org/jira/browse/THRIFT-3914
>             Project: Thrift
>          Issue Type: Bug
>          Components: Java - Library
>    Affects Versions: 0.9.3
>            Reporter: Chaoyu Tang
>            Priority: Major
>
> TSaslServerTransport uses a ByteArrayOutputStream as its write buffer, but 
> ByteArrayOutputStream's buffer size is limited to at most Integer.MAX_VALUE 
> (2,147,483,647) bytes. If it needs to write a result exceeding this limit, 
> it throws an OutOfMemoryError with the message "Requested array size 
> exceeds VM limit". Following is the stack trace from a Hive use case:
> {code}
> Exception in thread "pool-6-thread-9" java.lang.OutOfMemoryError: Requested array size exceeds VM limit
>         at java.util.Arrays.copyOf(Arrays.java:2271)
>         at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:113)
>         at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
>         at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:140)
>         at org.apache.thrift.transport.TSaslTransport.write(TSaslTransport.java:476)
>         at org.apache.thrift.transport.TSaslServerTransport.write(TSaslServerTransport.java:41)
>         at org.apache.thrift.protocol.TBinaryProtocol.writeString(TBinaryProtocol.java:202)
>         at org.apache.hadoop.hive.metastore.api.SerDeInfo$SerDeInfoStandardScheme.write(SerDeInfo.java:579)
>         at org.apache.hadoop.hive.metastore.api.SerDeInfo$SerDeInfoStandardScheme.write(SerDeInfo.java:501)
>         at org.apache.hadoop.hive.metastore.api.SerDeInfo.write(SerDeInfo.java:439)
>         at org.apache.hadoop.hive.metastore.api.StorageDescriptor$StorageDescriptorStandardScheme.write(StorageDescriptor.java:1490)
>         at org.apache.hadoop.hive.metastore.api.StorageDescriptor$StorageDescriptorStandardScheme.write(StorageDescriptor.java:1288)
>         at org.apache.hadoop.hive.metastore.api.StorageDescriptor.write(StorageDescriptor.java:1154)
>         at org.apache.hadoop.hive.metastore.api.Partition$PartitionStandardScheme.write(Partition.java:1072)
>         at org.apache.hadoop.hive.metastore.api.Partition$PartitionStandardScheme.write(Partition.java:929)
>         at org.apache.hadoop.hive.metastore.api.Partition.write(Partition.java:825)
>         at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result$get_partitions_resultStandardScheme.write(ThriftHiveMetastore.java)
>         at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result$get_partitions_resultStandardScheme.write(ThriftHiveMetastore.java)
>         at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result.write(ThriftHiveMetastore.java:65485)
>         at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:53)
>         at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>         at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:707)
>         at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:702)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
>         at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:702)
>         at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
