[ https://issues.apache.org/jira/browse/HIVE-20440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16606353#comment-16606353 ]
Sahil Takiar commented on HIVE-20440:
-------------------------------------

A few high-level comments:

(1) The Guava cache only performs cleanup on writes, and occasionally during reads. Since we don't expect this cache to see a large number of reads or writes, we need to schedule a thread that calls {{Cache#cleanUp}} periodically; see https://github.com/google/guava/wiki/CachesExplained#when-does-cleanup-happen for details.

(2) It would be ideal if we could use {{expireAfterAccess}} as well as {{softValues}} (not sure if the Guava cache allows this). We can set an expiration of, say, 30 seconds. This is beneficial when Spark tasks run in lockstep, all starting and ending at the same time. If this happens, then {{softValues}} might evict the hash table as soon as each batch of Spark tasks has completed. Adding a 30-second delay to eviction should allow enough time for the next batch of Spark tasks to be scheduled.

> Create better cache eviction policy for SmallTableCache
> -------------------------------------------------------
>
>                 Key: HIVE-20440
>                 URL: https://issues.apache.org/jira/browse/HIVE-20440
>             Project: Hive
>          Issue Type: Improvement
>          Components: Spark
>            Reporter: Antal Sinkovits
>            Assignee: Antal Sinkovits
>            Priority: Major
>         Attachments: HIVE-20440.01.patch, HIVE-20440.02.patch, HIVE-20440.03.patch, HIVE-20440.04.patch
>
>
> Enhance the SmallTableCache to use a Guava cache with soft references, so that
> we evict when there is memory pressure.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
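
As a rough illustration of points (1) and (2) in the comment above, here is a minimal sketch of a Guava cache built with both {{softValues}} and {{expireAfterAccess}}, plus a background thread that calls {{Cache#cleanUp}} on a fixed schedule. It is not taken from any HIVE-20440 patch: the class name {{SmallTableCacheSketch}}, its methods, the thread name, and the two timing parameters are all hypothetical and chosen only for this example.

```java
import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; names and parameters are assumptions, not Hive code.
public class SmallTableCacheSketch<K, V> implements AutoCloseable {
  private final Cache<K, V> cache;
  private final ScheduledExecutorService cleaner;

  public SmallTableCacheSketch(long expireSeconds, long cleanupPeriodSeconds) {
    // softValues: the GC may reclaim values under memory pressure.
    // expireAfterAccess: entries idle longer than expireSeconds become
    // eligible for removal on the next cache maintenance pass.
    this.cache = CacheBuilder.newBuilder()
        .softValues()
        .expireAfterAccess(expireSeconds, TimeUnit.SECONDS)
        .build();

    // Guava only does maintenance during cache operations, so a daemon
    // thread triggers cleanUp() periodically when traffic is low.
    this.cleaner = Executors.newSingleThreadScheduledExecutor(r -> {
      Thread t = new Thread(r, "small-table-cache-cleanup");
      t.setDaemon(true);
      return t;
    });
    this.cleaner.scheduleWithFixedDelay(
        cache::cleanUp, cleanupPeriodSeconds, cleanupPeriodSeconds, TimeUnit.SECONDS);
  }

  public void put(K key, V value) {
    cache.put(key, value);
  }

  // Returns null if the entry is absent, expired, or was reclaimed by the GC.
  public V get(K key) {
    return cache.getIfPresent(key);
  }

  @Override
  public void close() {
    cleaner.shutdownNow();
    cache.invalidateAll();
  }
}
```

{{CacheBuilder}} accepts {{softValues}} and {{expireAfterAccess}} together. One caveat: {{expireAfterAccess}} limits how long an idle entry may remain, but it does not stop the garbage collector from clearing a soft value earlier, so whether this combination alone gives the intended 30-second grace period would need to be verified.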