[ 
https://issues.apache.org/jira/browse/HIVE-20440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16606353#comment-16606353
 ] 

Sahil Takiar commented on HIVE-20440:
-------------------------------------

A few high-level comments:

(1) The Guava cache only performs cleanup on writes, and occasionally during 
reads. Since we don't expect this cache to see a large number of reads or 
writes, we need to schedule a thread that calls {{Cache#cleanUp}} periodically. 
See https://github.com/google/guava/wiki/CachesExplained#when-does-cleanup-happen 
for details.
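
A rough sketch of what (1) could look like, using only java.util.concurrent. 
The class name {{PeriodicCacheCleanup}}, the thread name, and the period are 
made up for illustration and are not taken from the patch. The task is passed 
in as a plain {{Runnable}}, so for a Guava cache the caller would presumably 
pass {{cache::cleanUp}}.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper; not part of Hive or Guava. */
final class PeriodicCacheCleanup {

  private PeriodicCacheCleanup() {}

  /**
   * Runs the given cleanup task repeatedly on a single daemon thread,
   * waiting `period` between the end of one run and the start of the next.
   * For a Guava cache the task would be cache::cleanUp (assumption).
   */
  static ScheduledExecutorService schedule(Runnable cleanUp, long period, TimeUnit unit) {
    ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor(r -> {
      Thread t = new Thread(r, "small-table-cache-cleanup"); // hypothetical name
      t.setDaemon(true); // do not keep the JVM alive just for maintenance
      return t;
    });
    // Note: if the task throws, the executor suppresses all later runs.
    executor.scheduleWithFixedDelay(cleanUp, period, period, unit);
    return executor;
  }

  public static void main(String[] args) throws InterruptedException {
    AtomicInteger runs = new AtomicInteger(); // stand-in for cache.cleanUp()
    ScheduledExecutorService executor =
        schedule(runs::incrementAndGet, 10, TimeUnit.MILLISECONDS);
    Thread.sleep(200);
    executor.shutdownNow();
    System.out.println("cleanup ran at least once: " + (runs.get() > 0));
  }
}
```

The returned executor should be shut down when the cache itself is torn down, 
otherwise the cleanup thread outlives it.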

(2) It would be ideal if we could use {{expireAfterAccess}} as well as 
{{softValues}} (not sure if the Guava cache allows this). We can set an 
expiration of, say, 30 seconds. This is beneficial when Spark tasks run in 
lockstep, all starting and ending at the same time. If this happens, then 
{{softValues}} might evict the hash table as soon as each batch of Spark 
tasks has completed. Adding a 30-second delay to eviction should allow enough 
time for the next batch of Spark tasks to be scheduled.
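
For (2), as far as I know {{CacheBuilder}} accepts both settings on the same 
builder. A minimal configuration sketch, assuming Guava is on the classpath; 
the class name {{SmallTableCacheSketch}} and the {{idleSeconds}} parameter are 
made up for illustration and are not taken from the patch. It only shows the 
two options combined: {{expireAfterAccess}} removes entries that have been idle 
for longer than the given duration, while {{softValues}} still leaves it to the 
garbage collector when a value is reclaimed, so whether this gives the 
30-second grace period in practice would need to be verified.

```java
import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch; not the actual SmallTableCache implementation. */
final class SmallTableCacheSketch {

  private SmallTableCacheSketch() {}

  /** Builds a cache with soft values whose entries also expire when idle. */
  static <K, V> Cache<K, V> build(long idleSeconds) {
    return CacheBuilder.newBuilder()
        .softValues() // values may be reclaimed by the GC
        .expireAfterAccess(idleSeconds, TimeUnit.SECONDS) // drop idle entries
        .build();
  }
}
```

A caller would use it as {{SmallTableCacheSketch.build(30)}} for the 30-second 
setting suggested above.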

> Create better cache eviction policy for SmallTableCache
> -------------------------------------------------------
>
>                 Key: HIVE-20440
>                 URL: https://issues.apache.org/jira/browse/HIVE-20440
>             Project: Hive
>          Issue Type: Improvement
>          Components: Spark
>            Reporter: Antal Sinkovits
>            Assignee: Antal Sinkovits
>            Priority: Major
>         Attachments: HIVE-20440.01.patch, HIVE-20440.02.patch, 
> HIVE-20440.03.patch, HIVE-20440.04.patch
>
>
> Enhance the SmallTableCache to use a Guava cache with soft references, so 
> that we evict when there is memory pressure.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
