[ 
https://issues.apache.org/jira/browse/HIVE-18264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16388241#comment-16388241
 ] 

Vaibhav Gumashta commented on HIVE-18264:
-----------------------------------------

[~akolb] Thanks for your review, and sorry for the late response; I was away 
for a few days.
Regarding this patch:
This patch does not add anything new to the existing feature. It reorganizes 
the old cache structure into one that is more efficient under concurrent 
access. Let me add more details in the description.
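
As a rough illustration of the reorganized layout (a sketch only, not the code 
in the patch; the class, field, and method names are hypothetical, and plain 
strings stand in for the real partition and column-stats objects), each table 
entry owns its partitions and partition column stats, so a lookup touches a 
single table's maps rather than iterating a global partition cache:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch, not the actual CachedStore code: all names are illustrative.
class HierarchicalCacheSketch {

  // One entry per table; its partitions and their column stats live inside it.
  static final class TableEntry {
    // partition name -> partition location (stand-in for a full partition object)
    final Map<String, String> partitions = new ConcurrentHashMap<>();
    // "partitionName/columnName" -> stats payload (stand-in for a column stats object)
    final Map<String, String> partitionColStats = new ConcurrentHashMap<>();
  }

  // "db.table" -> table entry
  private final Map<String, TableEntry> tables = new ConcurrentHashMap<>();

  private static String tableKey(String db, String table) {
    return db.toLowerCase() + "." + table.toLowerCase();
  }

  void addPartition(String db, String table, String partName, String location) {
    tables.computeIfAbsent(tableKey(db, table), k -> new TableEntry())
        .partitions.put(partName, location);
  }

  void putPartitionColStats(String db, String table, String partName,
                            String colName, String stats) {
    tables.computeIfAbsent(tableKey(db, table), k -> new TableEntry())
        .partitionColStats.put(partName + "/" + colName, stats);
  }

  // Returns null when the table or the partition is not cached.
  String getPartition(String db, String table, String partName) {
    TableEntry entry = tables.get(tableKey(db, table));
    return entry == null ? null : entry.partitions.get(partName);
  }

  // Returns null when the table or the stats entry is not cached.
  String getPartitionColStats(String db, String table, String partName, String colName) {
    TableEntry entry = tables.get(tableKey(db, table));
    return entry == null ? null : entry.partitionColStats.get(partName + "/" + colName);
  }

  // Dropping a table discards its partitions and column stats in one step.
  boolean dropTable(String db, String table) {
    return tables.remove(tableKey(db, table)) != null;
  }
}
```

With this shape, readers and writers working on different tables do not 
contend on a shared partition map, and dropping a table needs no scan over 
unrelated entries.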

Regarding your questions on the feature itself:
This is an optional feature (not enabled by default) targeted at users whose 
metadata can fit in main memory (or those who choose to allocate more memory). 
It is particularly useful with cloud databases, which tend to be slower, so 
query compilation takes much longer when one metastore call results in several 
RDBMS queries. There is also an option to use a whitelist/blacklist mechanism 
to keep some tables out of the cache and limit the memory footprint. 
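
To make the whitelist/blacklist idea concrete, here is a minimal sketch. It 
assumes the lists are regular expressions matched against "db.table" names and 
that the blacklist takes precedence; the class and method names are 
hypothetical and are not the metastore's actual API or configuration keys:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch: decides whether a table should be loaded into the cache.
class CacheFilterSketch {
  private final List<Pattern> whitelist;
  private final List<Pattern> blacklist;

  CacheFilterSketch(List<String> whitelistRegexes, List<String> blacklistRegexes) {
    this.whitelist = compile(whitelistRegexes);
    this.blacklist = compile(blacklistRegexes);
  }

  private static List<Pattern> compile(List<String> regexes) {
    List<Pattern> out = new ArrayList<>();
    for (String regex : regexes) {
      out.add(Pattern.compile(regex));
    }
    return out;
  }

  // Assumption: a blacklist match always wins; otherwise the table must
  // match at least one whitelist pattern to be cached.
  boolean shouldCache(String db, String table) {
    String name = db + "." + table;
    for (Pattern p : blacklist) {
      if (p.matcher(name).matches()) {
        return false;
      }
    }
    for (Pattern p : whitelist) {
      if (p.matcher(name).matches()) {
        return true;
      }
    }
    return false;
  }
}
```

For example, a whitelist of `sales\..*` with a blacklist of `sales\.tmp_.*` 
would cache `sales.orders` but skip `sales.tmp_load` and everything outside 
the `sales` database.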

Let me add a document to the wiki detailing the feature work done so far and 
some of the future improvements we are planning. I'll also move the related 
JIRAs under an umbrella and add the doc there. Let me know if that works. 
Thanks again for taking a look.

> CachedStore: Store cached partitions/col stats within the table cache and 
> make prewarm non-blocking
> ---------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-18264
>                 URL: https://issues.apache.org/jira/browse/HIVE-18264
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Vaibhav Gumashta
>            Assignee: Vaibhav Gumashta
>            Priority: Major
>         Attachments: HIVE-18264.1.patch, HIVE-18264.2.patch, 
> HIVE-18264.3.patch, HIVE-18264.4.patch, HIVE-18264.5.patch
>
>
> Currently we have a separate cache for partitions and partition col stats 
> which results in some calls iterating through each of these for 
> retrieving/updating. We can get better performance by organizing these 
> caches hierarchically. We should also make prewarm non-blocking.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
