Jalpan Randeri created HIVE-21752:
-------------------------------------

             Summary: Thread Safety and Memory Leaks in HCatRecordObjectInspectorFactory
                 Key: HIVE-21752
                 URL: https://issues.apache.org/jira/browse/HIVE-21752
             Project: Hive
          Issue Type: Bug
          Components: HCatalog
            Reporter: Jalpan Randeri


h3. Summary

There are two issues in HCatRecordObjectInspectorFactory [1], both caused by its 
use of a static Java HashMap to cache objects:
 # Java HashMap is not thread safe. In a multithreaded server, this can lead to 
race conditions and data corruption when two threads update the ObjectInspector 
cache at the same time.
 # There is no eviction policy, which can result in memory leaks. If a user 
reads many different schemas, the Hive server accumulates cached record and 
object inspectors and starts to see memory pressure.

This patch proposes replacing the HashMap with a Guava cache, which provides 
both cache eviction and thread safety. Guava cache is already used in Hive's 
ObjectInspectorFactory [2], so this change is consistent with the rest of Hive.
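
The two properties the cache needs (safe concurrent access and a bounded size) can be illustrated with a small sketch. This is not the attached patch and does not use Guava: it is a dependency-free stand-in built on the JDK's LinkedHashMap and Collections.synchronizedMap, and the class name BoundedCache, its maxEntries parameter, and the String keys in the usage note are all hypothetical.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical stand-in for a schema-to-inspector cache; not the Hive class. */
final class BoundedCache<K, V> {
    private final Map<K, V> entries;

    BoundedCache(final int maxEntries) {
        // An access-ordered LinkedHashMap drops its least recently used entry
        // once the size limit is exceeded, which bounds memory use.
        // synchronizedMap serializes every call, which makes access thread safe.
        this.entries = Collections.synchronizedMap(
            new LinkedHashMap<K, V>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                    return size() > maxEntries;
                }
            });
    }

    /** Returns the cached value, computing and storing it under the map lock if absent. */
    V get(K key, Function<K, V> loader) {
        return entries.computeIfAbsent(key, loader);
    }

    int size() {
        return entries.size();
    }
}
```

With a limit of 2, calling get for keys "a", "b", "a", "c" in that order leaves "a" and "c" cached and evicts "b", so a later get("b", ...) invokes the loader again. A Guava cache built with CacheBuilder.newBuilder().maximumSize(n).build() offers comparable behavior with finer-grained locking; the eviction settings to use for HCatRecordObjectInspectorFactory are not stated in this issue.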

Attached is a patch that fixes this issue.
h3. References:
 # https://github.com/apache/hive/blob/b58d50cb73a1f79a5d079e0a2c5ac33d2efc33a0/hcatalog/core/src/main/java/org/apache/hive/hcatalog/data/HCatRecordObjectInspectorFactory.java#L44-L47
 # https://github.com/apache/hive/blob/b58d50cb73a1f79a5d079e0a2c5ac33d2efc33a0/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorFactory.java#L68-L87



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
