Jason Dere created HIVE-18252:
---------------------------------
Summary: Limit the size of the object inspector caches
Key: HIVE-18252
URL: https://issues.apache.org/jira/browse/HIVE-18252
Project: Hive
Issue Type: Bug
Components: Types
Reporter: Jason Dere
Assignee: Jason Dere
While running some tests that had a lot of queries with constant values, I
noticed that ObjectInspectorFactory.cachedStandardStructObjectInspector started
using up a lot of memory.
It appears that StructObjectInspector caching does not work properly with
constant values. Constant ObjectInspectors are not cached, so each constant
expression creates a new constant ObjectInspector. And since object inspectors
do not override equals(), object inspector comparison relies on object instance
comparison. So even if the values are exactly the same as what is already in
the cache, the StructObjectInspector cache lookup would fail, and Hive would
create a new object inspector and add it to the cache, creating another entry
that would never be used. Plus, there is no max cache size - it's just a map
that is allowed to grow as long as values keep getting added to it.
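
The failure mode described above can be reproduced in isolation. The sketch below is not Hive code: ConstInspector and the cache map are hypothetical stand-ins, chosen only to show that a key built from objects without equals()/hashCode() never matches an earlier key, so every lookup adds an entry.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class IdentityCacheMiss {
    // Hypothetical stand-in for a constant object inspector. It carries a value
    // but does not override equals()/hashCode(), so comparison is by identity.
    static final class ConstInspector {
        final Object value;
        ConstInspector(Object value) { this.value = value; }
    }

    // Hypothetical stand-in for the struct inspector cache lookup: the key is
    // the list of field inspectors, and List.equals() delegates to element equals().
    static Object lookup(Map<List<ConstInspector>, Object> cache,
                         List<ConstInspector> fields) {
        return cache.computeIfAbsent(new ArrayList<>(fields), k -> new Object());
    }

    // Performs the given number of lookups, each with a freshly created
    // inspector for the same constant, and returns the resulting cache size.
    static int countEntries(int lookups) {
        Map<List<ConstInspector>, Object> cache = new ConcurrentHashMap<>();
        for (int i = 0; i < lookups; i++) {
            lookup(cache, Collections.singletonList(new ConstInspector(42)));
        }
        return cache.size();
    }

    public static void main(String[] args) {
        // Every lookup misses, so the cache holds one entry per lookup.
        System.out.println(countEntries(1000)); // prints 1000
    }
}
```

Although every inspector wraps the same constant, none of the 1000 entries is ever reused.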
Some possible solutions I can think of:
1. Limit the size of the object inspector caches, rather than growing without
bound.
2. Try to fix the caching to work with constant values. This would require
implementing equals() on the constant object inspectors (which could be slow in
nested cases), or else we would have to start caching constant object
inspectors, which could be expensive in terms of memory usage. Could be used in
combination with (1). By itself this is not a great solution because this still
has the unbounded cache growth issue.
3. Disable caching in the case of constant object inspectors since this
scenario currently doesn't work. This could be used in combination with (1).
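
As a rough illustration of option (1), a bounded cache can be built from the JDK's LinkedHashMap. This is only a sketch under assumptions: the class name BoundedInspectorCache, the LRU eviction policy, and the size limit are hypothetical and not taken from Hive.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical size-limited cache; evicts the least recently used entry
// once the number of entries exceeds maxEntries.
class BoundedInspectorCache<K, V> {
    private final Map<K, V> map;

    BoundedInspectorCache(final int maxEntries) {
        // accessOrder=true keeps entries ordered from least to most recently
        // accessed; synchronizedMap makes individual calls thread-safe.
        this.map = Collections.synchronizedMap(
            new LinkedHashMap<K, V>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                    return size() > maxEntries;
                }
            });
    }

    V get(K key) { return map.get(key); }

    void put(K key, V value) { map.put(key, value); }

    int size() { return map.size(); }

    public static void main(String[] args) {
        BoundedInspectorCache<String, String> cache = new BoundedInspectorCache<>(2);
        cache.put("a", "inspectorA");
        cache.put("b", "inspectorB");
        cache.get("a");               // "a" becomes the most recently used entry
        cache.put("c", "inspectorC"); // exceeds the limit, so "b" is evicted
        System.out.println(cache.size());   // prints 2
        System.out.println(cache.get("b")); // prints null
    }
}
```

With a bound like this, entries that are never hit again (such as those created for constant object inspectors) are eventually evicted instead of accumulating.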