[ 
https://issues.apache.org/jira/browse/HIVE-28977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17979349#comment-17979349
 ] 

Zhihua Deng edited comment on HIVE-28977 at 6/17/25 3:11 AM:
-------------------------------------------------------------

+1 for making ExpiredTokenRemover run in a single HMS instance for the DB 
token store.

For the ZK token store, we can optimize it in the future if someone hits the issue.


was (Author: dengzh):
+1 for making ExpiredTokenRemover run in a single HMS instance.

For ZooKeeperTokenStore, I think we can optimize it in the future if someone hits 
the issue.

> Externalize the ExpiredTokenRemover to housekeeping threads
> -----------------------------------------------------------
>
>                 Key: HIVE-28977
>                 URL: https://issues.apache.org/jira/browse/HIVE-28977
>             Project: Hive
>          Issue Type: Improvement
>          Components: HiveServer2, Standalone Metastore
>    Affects Versions: 4.1.0, 4.0.1
>            Reporter: Miklos Szurap
>            Assignee: Miklos Szurap
>            Priority: Major
>              Labels: cleanup, delegationtoken, maintenance
>
> In many deployments there are multiple HS2 and HMS instances, and the 
> "hive.cluster.delegation.token.store.class" is configured to 
> "org.apache.hadoop.hive.thrift.DBTokenStore", which stores the DTs in the HMS, 
> ultimately in the "DELEGATION_TOKENS" table.
> Currently (master / d6bcdf652d) the implementation of the token cleanup is 
> very inefficient:
> - All the HS2 and HMS instances start the DT cleanup thread, see "Starting 
> expired delegation token remover thread" in the logs.
> - This "ExpiredTokenRemover" thread actually renews the tokens (if not 
> expired), or removes them (if expired). This is fine. However, it first 
> fetches ALL the delegation tokens (one 
> "tokenStore.getAllDelegationTokenIdentifiers()" call), and then iterates 
> through them to get their details (many "tokenStore.getToken(id)" calls). As 
> this is also done from the HS2 side, this creates lots of "remote" calls to 
> the HMS, which is very inefficient.
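
To make the call pattern described above concrete, here is a minimal Java sketch. It is not Hive's code: SimpleTokenStore and CountingStore are hypothetical, simplified stand-ins, tokens are reduced to an id and an expiry time, and renewal is left out. Only the method names getAllDelegationTokenIdentifiers and getToken are taken from the description. Each store call is counted as if it were one remote call to the HMS.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TokenCleanupSketch {

    /** Hypothetical, simplified stand-in for a token store; not Hive's real interface. */
    interface SimpleTokenStore {
        List<String> getAllDelegationTokenIdentifiers();
        Long getToken(String id);       // expiry time in millis, or null if absent
        boolean removeToken(String id);
    }

    /** In-memory store that counts every call, standing in for remote HMS round trips. */
    static class CountingStore implements SimpleTokenStore {
        private final Map<String, Long> expiryById = new LinkedHashMap<>();
        int calls = 0;

        void put(String id, long expiry) { expiryById.put(id, expiry); }

        @Override
        public List<String> getAllDelegationTokenIdentifiers() {
            calls++;
            return new ArrayList<>(expiryById.keySet());
        }

        @Override
        public Long getToken(String id) {
            calls++;
            return expiryById.get(id);
        }

        @Override
        public boolean removeToken(String id) {
            calls++;
            return expiryById.remove(id) != null;
        }
    }

    /** One cleanup pass: list all ids, look each one up, remove the expired ones. */
    static int removeExpiredTokens(SimpleTokenStore store, long now) {
        int removed = 0;
        for (String id : store.getAllDelegationTokenIdentifiers()) {
            Long expiry = store.getToken(id);
            if (expiry != null && expiry < now && store.removeToken(id)) {
                removed++;
            }
        }
        return removed;
    }

    /** Runs one pass per instance against a shared store and returns the total call count. */
    static int callsFor(int instances, int tokens, int expired) {
        long now = 1_000_000L;
        CountingStore store = new CountingStore();
        for (int i = 0; i < tokens; i++) {
            // The first `expired` tokens are already past their expiry time.
            store.put("token-" + i, i < expired ? now - 1 : now + 60_000);
        }
        for (int i = 0; i < instances; i++) {
            removeExpiredTokens(store, now);
        }
        return store.calls;
    }

    public static void main(String[] args) {
        System.out.println("1 instance:  " + callsFor(1, 1000, 100) + " store calls");
        System.out.println("6 instances: " + callsFor(6, 1000, 100) + " store calls");
    }
}
```

In this sketch a single pass over N tokens costs one listing call plus N lookups plus one removal per expired token, and every additional instance repeats the listing and the lookups for the tokens that remain.
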
> We should optimize this and do it in the Metastore's housekeeping threads. 
> Ideally only one HMS is a leader (dynamic leader election) so it could be 
> done from a single place.
> Note that there can be many thousands of DTs stored in the DB, depending on 
> the token lifetime configurations and usage patterns, so this change could 
> save a lot of cycles.
> Which DT stores are affected?
> - The MemoryTokenStore should be untouched, as it is indeed a "per instance" 
> store and the cleanup should run everywhere.
> - The ZooKeeperTokenStore can be individually configured 
> ("hive.cluster.delegation.token.store.zookeeper.znode"), so it is not safe to 
> do it from a single place.
> - As such only the DBTokenStore can be optimized like this.
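
The proposal above (run the cleanup only on the leader HMS, and only for the DB-backed store) could be sketched roughly as follows. Everything here is an assumption for illustration: StoreKind, shouldRunCleanup and housekeepingTask are hypothetical names, and the isLeader supplier stands in for whatever leader election the Metastore actually uses. This is not the patch for this issue.

```java
import java.util.function.BooleanSupplier;

public class LeaderGatedCleanupSketch {

    /** Hypothetical labels for the three store kinds named in the description. */
    enum StoreKind { MEMORY, ZOOKEEPER, DB }

    /** Per the description, only the shared DB-backed store can be cleaned from one place. */
    static boolean canCentralize(StoreKind kind) {
        return kind == StoreKind.DB;
    }

    /** Decides whether this instance should run the cleanup pass. */
    static boolean shouldRunCleanup(StoreKind kind, BooleanSupplier isLeader) {
        if (!canCentralize(kind)) {
            // Memory store is per instance and the ZooKeeper store may be configured
            // individually, so every instance keeps cleaning up its own store.
            return true;
        }
        return isLeader.getAsBoolean();
    }

    /** Wraps a cleanup action so a housekeeping thread can call it unconditionally. */
    static Runnable housekeepingTask(StoreKind kind, BooleanSupplier isLeader, Runnable cleanup) {
        return () -> {
            if (shouldRunCleanup(kind, isLeader)) {
                cleanup.run();
            }
        };
    }

    public static void main(String[] args) {
        int[] runs = {0};
        Runnable cleanup = () -> runs[0]++;
        housekeepingTask(StoreKind.DB, () -> false, cleanup).run();     // follower, DB store
        housekeepingTask(StoreKind.DB, () -> true, cleanup).run();      // leader, DB store
        housekeepingTask(StoreKind.MEMORY, () -> false, cleanup).run(); // per-instance store
        System.out.println("cleanup runs: " + runs[0]);
    }
}
```

Keeping the decision in one small predicate means the existing per-instance behaviour for the memory and ZooKeeper stores stays unchanged, while the DB store gets a single cleaner.
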



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
