[ https://issues.apache.org/jira/browse/HIVE-18252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16286980#comment-16286980 ]
Ashutosh Chauhan commented on HIVE-18252:
-----------------------------------------

I wonder if there is really an advantage to this cache. Caching OIs seems to be of very little benefit. Perhaps we should just delete the cache and the factory.

> Limit the size of the object inspector caches
> ---------------------------------------------
>
>                 Key: HIVE-18252
>                 URL: https://issues.apache.org/jira/browse/HIVE-18252
>             Project: Hive
>          Issue Type: Bug
>          Components: Types
>            Reporter: Jason Dere
>            Assignee: Jason Dere
>         Attachments: HIVE-18252.1.patch
>
>
> Was running some tests that had a lot of queries with constant values, and
> noticed that ObjectInspectorFactory.cachedStandardStructObjectInspector
> started using up a lot of memory.
> It appears that StructObjectInspector caching does not work properly with
> constant values. Constant ObjectInspectors are not cached, so each constant
> expression creates a new constant ObjectInspector. And since object
> inspectors do not override equals(), object inspector comparison relies on
> object instance comparison. So even if the values are exactly the same as
> what is already in the cache, the StructObjectInspector cache lookup would
> fail, and Hive would create a new object inspector and add it to the cache,
> creating another entry that would never be used. Plus, there is no max cache
> size - it's just a map that is allowed to grow as long as values keep getting
> added to it.
> Some possible solutions I can think of:
> 1. Limit the size of the object inspector caches, rather than growing without
> bound.
> 2. Try to fix the caching to work with constant values. This would require
> implementing equals() on the constant object inspectors (which could be slow
> in nested cases), or else we would have to start caching constant object
> inspectors, which could be expensive in terms of memory usage. Could be used
> in combination with (1). By itself this is not a great solution because this
> still has the unbounded cache growth issue.
> 3. Disable caching in the case of constant object inspectors since this
> scenario currently doesn't work. This could be used in combination with (1).
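
For illustration, the failure mode in the description and option (1) can be reproduced with a small, self-contained Java sketch. It does not use Hive's classes: ConstantInspector, BoundedCache, lookup and simulate are hypothetical stand-ins, and the LinkedHashMap-based LRU eviction is only one possible way to bound a cache, not necessarily what HIVE-18252.1.patch does.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class InspectorCacheSketch {

    /** Hypothetical stand-in for a constant object inspector; it does NOT override equals(). */
    static final class ConstantInspector {
        final Object value;
        ConstantInspector(Object value) { this.value = value; }
    }

    /** Map that evicts its least recently used entry once maxEntries is exceeded. */
    static final class BoundedCache<K, V> extends LinkedHashMap<K, V> {
        private final int maxEntries;

        BoundedCache(int maxEntries) {
            super(16, 0.75f, true); // accessOrder = true gives LRU ordering
            this.maxEntries = maxEntries;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            return size() > maxEntries;
        }
    }

    /** Looks up a "struct inspector" keyed by its field inspectors, creating one on a miss. */
    static String lookup(Map<List<ConstantInspector>, String> cache, Object constant) {
        // A fresh constant inspector is created per expression, as in the description.
        List<ConstantInspector> key = Arrays.asList(new ConstantInspector(constant));
        String cached = cache.get(key); // List.equals() falls back to identity on the elements
        if (cached == null) {
            cached = "struct<" + constant + ">";
            cache.put(key, cached); // an entry that no later lookup can ever hit
        }
        return cached;
    }

    /** Runs the same constant through the cache n times and returns the resulting cache size. */
    static int simulate(Map<List<ConstantInspector>, String> cache, int n) {
        for (int i = 0; i < n; i++) {
            lookup(cache, 42);
        }
        return cache.size();
    }

    public static void main(String[] args) {
        Map<List<ConstantInspector>, String> unbounded = new HashMap<>();
        System.out.println("unbounded entries: " + simulate(unbounded, 1000));

        Map<List<ConstantInspector>, String> bounded = new BoundedCache<>(100);
        System.out.println("bounded entries: " + simulate(bounded, 1000));
    }
}
```

Running main prints 1000 entries for the plain HashMap and 100 for the bounded map, even though every lookup uses the same constant. Bounding the map caps memory use but does not make the lookups hit, which is why the description suggests combining (1) with (2) or (3).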