[ 
https://issues.apache.org/jira/browse/HIVE-28094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17822328#comment-17822328
 ] 

Soumyakanti Das commented on HIVE-28094:
----------------------------------------

After further testing, I found that the fields of GetTableRequest do not 
uniquely identify a tableID. With the current PR, we will run into issues if we 
create a table X, drop it, and then recreate it with the same name but with an 
additional column: the cache will still return the tableID of the older table. 
Thus, I think the current implementation, which caches tableIDs only in the 
query cache, is the best we can do. We cannot cache them in the HS2-level 
cache.
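
To make the failure mode concrete, here is a minimal sketch of a long-lived cache keyed only by name fields. It is not Hive code: ToyMetastore, NameKeyedIdCache, and demo are hypothetical names made up for illustration, and the only assumption taken from the comment above is that a recreated table gets a different tableID.

```python
from itertools import count


class ToyMetastore:
    """Hypothetical stand-in for HMS: every CREATE assigns a fresh table ID."""

    def __init__(self):
        self._next_id = count(1)
        self._tables = {}  # (cat, db, name) -> table ID

    def create_table(self, cat, db, name):
        self._tables[(cat, db, name)] = next(self._next_id)

    def drop_table(self, cat, db, name):
        del self._tables[(cat, db, name)]

    def get_table_id(self, cat, db, name):
        return self._tables[(cat, db, name)]


class NameKeyedIdCache:
    """Hypothetical long-lived cache whose key holds only name fields."""

    def __init__(self, metastore):
        self._metastore = metastore
        self._ids = {}

    def get_table_id(self, cat, db, name):
        key = (cat, db, name)
        if key not in self._ids:
            # Cache miss: ask the metastore and remember the answer.
            self._ids[key] = self._metastore.get_table_id(cat, db, name)
        return self._ids[key]


def demo():
    hms = ToyMetastore()
    cache = NameKeyedIdCache(hms)
    table = ("hive", "default", "x")

    hms.create_table(*table)
    first = cache.get_table_id(*table)

    # Drop and recreate under the same name (e.g. with an extra column).
    hms.drop_table(*table)
    hms.create_table(*table)

    cached = cache.get_table_id(*table)  # served from the cache
    actual = hms.get_table_id(*table)    # what the metastore now holds
    return first, cached, actual
```

In this toy setup the name-based key is identical before and after the recreate, so the cache keeps answering with the ID of the dropped table. A cache that is discarded at the end of each query does not outlive the table in the same way, which is presumably why the query cache remains a safe place for tableIDs.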

I am not planning to work on this in the near future - but I may revisit this 
at a later point.

> Improve HMS client cache and query cache performance for getTableInternal
> -------------------------------------------------------------------------
>
>                 Key: HIVE-28094
>                 URL: https://issues.apache.org/jira/browse/HIVE-28094
>             Project: Hive
>          Issue Type: Improvement
>          Components: Hive
>    Affects Versions: 4.0.0-beta-1
>            Reporter: Soumyakanti Das
>            Assignee: Soumyakanti Das
>            Priority: Major
>              Labels: pull-request-available
>
> Currently we cache calls to the {{getTableInternal}} method in the HMS client 
> cache and the query cache. We also cache table ids in the query cache, but not 
> in the HMS client cache.
>  
> To cache {{getTableInternal}}, we create a CacheKey containing the 
> {{GetTableRequest}} object. However, we do not check whether all the necessary 
> fields are set in the key. This results in many cache misses, especially 
> because we rely on {{validWriteIdList}} not being null and {{tableId}} not 
> being -1. The {{GetTableRequest}} object also contains {{catName}}, which is 
> not always set. Together, these gaps create duplicate keys, so the caches are 
> not used efficiently.
>  
> Moreover, {{getTableInternal}} is called from other APIs that are also 
> cached, e.g. {{getPartitionsByExprInternal}}, so improvements in its 
> performance will positively affect other APIs too.
>  
> *RESULTS:*
> I ran all TPCDS explain cbo queries on my local machine, after cherry-picking 
> [HIVE-28083: Enable HMS client cache and HMS query cache for Explain 
> plans|https://github.com/apache/hive/pull/5092/commits/41a766d6a51480edb505fd53661a03c63ef3937a].
>  Then I analyzed the logs with a simple python script to get min, 25th 
> percentile, median, 75th percentile, and max for PERFLOG logs with this 
> pattern:
> {code:java}
> </PERFLOG method=(\w+) start=\d+ end=\d+ duration=(\d+) from=.* HS2-cache>
> {code}
> Here are the results.
> *WITHOUT the improvements to {{getTableInternal}} method:*
> |*API*|*MIN*|*25th*|*MEDIAN*|*75th*|*MAX*|
> |*getTable*|2|3|3|4|233|
> |*getTableConstraints*|2|4|4|5|22|
> |*getPartitionsByExpr*|19|22|25|27|2396|
> |*getAggrColStatsFor*|0|125.5|186|284|910|
> |*getTableColumnStatistics*|4|6|7|8|454|
> Cache Stats:
> {code:java}
> CacheStats{hitCount=77464, missCount=11919, loadSuccessCount=0, 
> loadFailureCount=0, totalLoadTime=0, evictionCount=0, evictionWeight=0} {code}
> *WITH the improvements to {{getTableInternal}} method:*
> |*API*|*MIN*|*25th*|*MEDIAN*|*75th*|*MAX*|
> |*getTable*|0|0|0|0|33|
> |*getTableConstraints*|3|4|4|5|20|
> |*getPartitionsByExpr*|14|16|19|21|2247|
> |*getAggrColStatsFor*|0|124.5|187|272.5|936|
> |*getTableColumnStatistics*|0|0|0|1|16|
> Cache Stats:
> {code:java}
> CacheStats{hitCount=81044, missCount=11943, loadSuccessCount=0, 
> loadFailureCount=0, totalLoadTime=0, evictionCount=0, evictionWeight=0} {code}
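> With this patch, the median latency drops for {{getTable}} (3 to 0), 
> {{getTableColumnStatistics}} (7 to 0), and {{getPartitionsByExpr}} (25 to 19), 
> while {{getTableConstraints}} and {{getAggrColStatsFor}} stay roughly the 
> same. The cache {{hitCount}} rises from 77464 to 81044.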
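
The log analysis described in the issue could be sketched roughly as follows. This is not the reporter's script: the function names are hypothetical, and the linear-interpolation percentile is an assumption, chosen because it can produce half values such as the 125.5 seen in the tables. The regular expression is the one quoted in the description.

```python
import re
from collections import defaultdict

# Pattern quoted in the issue description.
PERFLOG = re.compile(
    r"</PERFLOG method=(\w+) start=\d+ end=\d+ duration=(\d+) from=.* HS2-cache>"
)


def percentile(sorted_vals, p):
    """Percentile p (0-100) of a non-empty sorted list, linearly interpolated."""
    if not sorted_vals:
        raise ValueError("no values")
    rank = (len(sorted_vals) - 1) * p / 100
    lo = int(rank)  # rank is non-negative, so int() floors it
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (rank - lo)


def summarize(lines):
    """Map each method name to (min, 25th, median, 75th, max) of its durations."""
    durations = defaultdict(list)
    for line in lines:
        match = PERFLOG.search(line)
        if match:  # lines that do not match the pattern are ignored
            durations[match.group(1)].append(int(match.group(2)))

    summary = {}
    for method, values in durations.items():
        values.sort()
        summary[method] = (
            values[0],
            percentile(values, 25),
            percentile(values, 50),
            percentile(values, 75),
            values[-1],
        )
    return summary


if __name__ == "__main__":
    import sys

    # Usage: python perflog_summary.py < hive.log
    for method, stats in sorted(summarize(sys.stdin).items()):
        print(method, *stats)
```

Grouping by the captured method name yields one row per API, which matches the layout of the result tables in the description.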



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
