[ https://issues.apache.org/jira/browse/HIVE-21642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alexander Knopov updated HIVE-21642:
------------------------------------
    Description:
We are continuously loading data into Hive table backed by files in ORC format by appending data in batches. We repeatedly have seen that over a span of few days Hive server experiences {{OutOfMemoryError}} exceptions that we believe are caused by memory leaks.

Comparison of heap dumps shows that most suspicious classes that show persistent growth and not recycled with GC are
* {{org.apache.hadoop.hive.ql.io.orc.OrcStruct$Field}}
* {{org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField}}
* {{String}}

Sample program used for stress test and heap dumps from 700 to 2500 GB can be uploaded on request. They are too big for Jira backing store

  was:
We are continuously loading data into Hive table stored in ORC format by appending data in batches. We repeatedly have seen that over a span of few days Hive server experience {{OutOfMemoryError}} exceptions that we believe are caused by memory leaks.

Comparing heap dumps shows that most suspicious classes that show persistent growth are and not recycled with GC are
* {{org.apache.hadoop.hive.ql.io.orc.OrcStruct$Field}}
* {{org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField}}
* {{String}}

Sample program used for stress test and heap dumps from 700 to 2500 GB can be uploaded on request. They are too big for Jira backing store


> Hive server leaks memory on data insertion
> ------------------------------------------
>
>                 Key: HIVE-21642
>                 URL: https://issues.apache.org/jira/browse/HIVE-21642
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>    Affects Versions: 2.3.4
>         Environment: * Amazon Hadoop Distribution emr-5.20.0
> * Master mode with 4 CPU and 16 GB RAM
> * Table files stored in S3 cloud storage
>            Reporter: Alexander Knopov
>            Priority: Major
>
> We are continuously loading data into Hive table backed by files in ORC
> format by appending data in batches. We repeatedly have seen that over a span
> of few days Hive server experiences {{OutOfMemoryError}} exceptions that we
> believe are caused by memory leaks.
> Comparison of heap dumps shows that most suspicious classes that show
> persistent growth and not recycled with GC are
> * {{org.apache.hadoop.hive.ql.io.orc.OrcStruct$Field}}
> *
> {{org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField}}
> * {{String}}
> Sample program used for stress test and heap dumps from 700 to 2500 GB can be
> uploaded on request. They are too big for Jira backing store



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
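
The heap dump comparison described in the report can be approximated without shipping full dumps, by diffing two class histograms taken some time apart (for example with `jmap -histo <pid>`). The sketch below is only an illustration of that idea, not the reporter's stress-test program: the helper names `parse_histo` and `growth` are hypothetical, and the sample histograms are made-up numbers, not data from this issue.

```python
import re

# Matches one data row of `jmap -histo` output, e.g.
# "   1:       1234567      123456789  java.lang.String"
ROW = re.compile(r"^\s*\d+:\s+(\d+)\s+(\d+)\s+(\S+)")


def parse_histo(text):
    """Return {class_name: (instances, bytes)} parsed from histogram text."""
    counts = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:  # header, separator and "Total" lines do not match
            counts[m.group(3)] = (int(m.group(1)), int(m.group(2)))
    return counts


def growth(before, after):
    """List (class_name, instance_delta, byte_delta) for classes whose
    byte footprint grew, largest byte growth first."""
    rows = []
    for name, (inst, size) in after.items():
        old_inst, old_size = before.get(name, (0, 0))
        if size > old_size:
            rows.append((name, inst - old_inst, size - old_size))
    return sorted(rows, key=lambda r: r[2], reverse=True)


# Made-up sample histograms for illustration only; not taken from the report.
EARLY = """\
 num     #instances         #bytes  class name
----------------------------------------------
   1:          1000          24000  java.lang.String
   2:           500          12000  org.apache.hadoop.hive.ql.io.orc.OrcStruct$Field
   3:           200           6400  java.util.HashMap$Node
Total          1700          42400
"""

LATE = """\
 num     #instances         #bytes  class name
----------------------------------------------
   1:          9000         216000  java.lang.String
   2:          8000         192000  org.apache.hadoop.hive.ql.io.orc.OrcStruct$Field
   3:           200           6400  java.util.HashMap$Node
Total         17200         414400
"""

if __name__ == "__main__":
    for name, d_inst, d_bytes in growth(parse_histo(EARLY), parse_histo(LATE)):
        print(f"{name}: +{d_inst} instances, +{d_bytes} bytes")
```

Classes that keep appearing at the top of this list across several snapshots, even after a forced full GC, are the candidates worth inspecting in a full heap dump.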