[ https://issues.apache.org/jira/browse/HIVE-18252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16284226#comment-16284226 ]

Jason Dere commented on HIVE-18252:
-----------------------------------

I don't really want to go through the process of adding equals/hashcode to all of the object inspectors - they were not created with this in mind and none of them implement it. I'll add a check to disable caching in the case of constant object inspectors, in addition to limiting the amount of object inspectors kept in the cache.

> Limit the size of the object inspector caches
> ---------------------------------------------
>
>                 Key: HIVE-18252
>                 URL: https://issues.apache.org/jira/browse/HIVE-18252
>             Project: Hive
>          Issue Type: Bug
>          Components: Types
>            Reporter: Jason Dere
>            Assignee: Jason Dere
>
> Was running some tests that had a lot of queries with constant values, and
> noticed that ObjectInspectorFactory.cachedStandardStructObjectInspector
> started using up a lot of memory.
>
> It appears that StructObjectInspector caching does not work properly with
> constant values. Constant ObjectInspectors are not cached, so each constant
> expression creates a new constant ObjectInspector. And since object
> inspectors do not override equals(), object inspector comparison relies on
> object instance comparison. So even if the values are exactly the same as
> what is already in the cache, the StructObjectInspector cache lookup would
> fail, and Hive would create a new object inspector and add it to the cache,
> creating another entry that would never be used. Plus, there is no max cache
> size - it's just a map that is allowed to grow as long as values keep getting
> added to it.
>
> Some possible solutions I can think of:
> 1. Limit the size of the object inspector caches, rather than growing without
> bound.
> 2. Try to fix the caching to work with constant values. This would require
> implementing equals() on the constant object inspectors (which could be slow
> in nested cases), or else we would have to start caching constant object
> inspectors, which could be expensive in terms of memory usage. Could be used
> in combination with (1). By itself this is not a great solution because this
> still has the unbounded cache growth issue.
> 3. Disable caching in the case of constant object inspectors since this
> scenario currently doesn't work. This could be used in combination with (1).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
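
For illustration only, the failure mode described in the issue, and the effect of options (1) and (3), can be reproduced in miniature. This is a rough sketch under stated assumptions, not the actual Hive patch: Inspector, BoundedCache, getStruct and entriesAfter are hypothetical stand-ins, not Hive classes, and the LinkedHashMap-based LRU is just one possible way to bound a map.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for illustration; none of these are Hive classes.
class InspectorCacheSketch {

    /** Stand-in for an object inspector. It does not override
     *  equals()/hashCode(), so comparison is by object instance. */
    static class Inspector {
        final Object constantValue; // null means "not a constant inspector"
        Inspector(Object constantValue) { this.constantValue = constantValue; }
        boolean isConstant() { return constantValue != null; }
    }

    /** Access-ordered map that evicts its least recently used entry once full. */
    static class BoundedCache<K, V> extends LinkedHashMap<K, V> {
        private final int maxEntries;
        BoundedCache(int maxEntries) {
            super(16, 0.75f, true);
            this.maxEntries = maxEntries;
        }
        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            return size() > maxEntries;
        }
    }

    /** Looks up a struct inspector keyed by its field inspectors; creates one on a miss. */
    static Inspector getStruct(Map<List<Inspector>, Inspector> cache,
                               List<Inspector> fields, boolean skipConstants) {
        if (skipConstants) {
            for (Inspector f : fields) {
                if (f.isConstant()) {
                    return new Inspector(null); // option (3): bypass the cache
                }
            }
        }
        Inspector cached = cache.get(fields); // List.equals falls back to instance comparison
        if (cached == null) {
            cached = new Inspector(null);
            cache.put(fields, cached);
        }
        return cached;
    }

    /** Simulates `queries` identical queries that each use the constant 42. */
    static int entriesAfter(Map<List<Inspector>, Inspector> cache,
                            int queries, boolean skipConstants) {
        for (int i = 0; i < queries; i++) {
            List<Inspector> fields = new ArrayList<>();
            fields.add(new Inspector(42)); // a fresh constant inspector every time
            getStruct(cache, fields, skipConstants);
        }
        return cache.size();
    }

    public static void main(String[] args) {
        System.out.println(entriesAfter(new HashMap<>(), 1000, false));         // unbounded map
        System.out.println(entriesAfter(new BoundedCache<>(100), 1000, false)); // option (1)
        System.out.println(entriesAfter(new HashMap<>(), 1000, true));          // option (3)
    }
}
```

In this sketch the plain map ends up with one dead entry per query, the bounded map stops at its limit, and skipping constants leaves the cache empty. Combining the last two corresponds to the approach proposed in the comment above.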