[ 
https://issues.apache.org/jira/browse/HIVE-27260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taraka Rama Rao Lethavadla updated HIVE-27260:
----------------------------------------------
    Description: 
*get_all_token_identifiers* retrieves every entry in the DELEGATION_TOKENS 
table at once.

On systems where that table holds a very large number of rows, loading them 
all in a single call causes an OOM in HMS.

Could we add a batching mechanism for retrieving rows from that table?
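
To illustrate what batching could look like, here is a minimal sketch of keyset pagination: each call fetches at most a fixed number of identifiers greater than the last one seen, so the caller never holds more than one batch in memory. Everything in it is an assumption for illustration only. TokenTable, fetchBatch and forEachBatch are hypothetical names, not existing Hive or HMS APIs, and the in-memory TreeSet merely stands in for an ordered query against DELEGATION_TOKENS.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;
import java.util.function.Consumer;

class BatchedTokenScan {

    /** Hypothetical stand-in for the DELEGATION_TOKENS table, ordered by token identifier. */
    static final class TokenTable {
        private final NavigableSet<String> idents = new TreeSet<>();

        void add(String ident) {
            idents.add(ident);
        }

        /** Returns at most 'limit' identifiers strictly greater than 'after' (null means start). */
        List<String> fetchBatch(String after, int limit) {
            NavigableSet<String> tail = (after == null) ? idents : idents.tailSet(after, false);
            List<String> page = new ArrayList<>(limit);
            for (String ident : tail) {
                if (page.size() == limit) {
                    break;
                }
                page.add(ident);
            }
            return page;
        }
    }

    /** Hands every identifier to 'handler' one batch at a time; returns the number of batches. */
    static int forEachBatch(TokenTable table, int batchSize, Consumer<List<String>> handler) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        String last = null;   // last identifier seen so far; null before the first batch
        int batches = 0;
        while (true) {
            List<String> page = table.fetchBatch(last, batchSize);
            if (page.isEmpty()) {
                break;
            }
            handler.accept(page);
            batches++;
            last = page.get(page.size() - 1);
            if (page.size() < batchSize) {
                break;        // a short page means the end of the table was reached
            }
        }
        return batches;
    }

    public static void main(String[] args) {
        TokenTable table = new TokenTable();
        for (int i = 0; i < 10; i++) {
            table.add(String.format("token-%03d", i));
        }
        List<String> seen = new ArrayList<>();
        int batches = forEachBatch(table, 4, seen::addAll);
        System.out.println(batches + " batches, " + seen.size() + " identifiers");
        // prints "3 batches, 10 identifiers"
    }
}
```

Resuming from the last identifier seen, rather than from a row offset, should keep the scan stable if the cleaner deletes rows while it is running, although whether that fits the actual metastore schema and query layer would need to be checked.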

Could we also add a timestamp column to that table, so that old entries can be 
cleaned up manually in case the cleaner does not help?

The expired-token removal thread uses the same API, 
*get_all_token_identifiers*, so the cleaner itself struggles once the number 
of rows grows large.

There is a related feature, https://issues.apache.org/jira/browse/HIVE-17609, 
but it also uses *get_all_token_identifiers*, so it will run into the same 
issue.

  was:
*get_all_token_identifiers* retrieves every entry in the DELEGATION_TOKENS 
table at once.

On systems where that table holds a very large number of rows, loading them 
all in a single call causes an OOM in HMS.

Could we add a batching mechanism for retrieving rows from that table?

Could we also add a timestamp column to that table, so that old entries can be 
cleaned up manually in case the cleaner does not help?

The expired-token removal thread uses the same API, 
*get_all_token_identifiers*, so the cleaner itself struggles once the number 
of rows grows large.


> OOM while retrieving delegation tokens using get_all_token_identifiers call
> ---------------------------------------------------------------------------
>
>                 Key: HIVE-27260
>                 URL: https://issues.apache.org/jira/browse/HIVE-27260
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive, Standalone Metastore
>            Reporter: Taraka Rama Rao Lethavadla
>            Priority: Major
>
> *get_all_token_identifiers* retrieves every entry in the DELEGATION_TOKENS 
> table at once.
> On systems where that table holds a very large number of rows, loading them 
> all in a single call causes an OOM in HMS.
> Could we add a batching mechanism for retrieving rows from that table?
> Could we also add a timestamp column to that table, so that old entries can 
> be cleaned up manually in case the cleaner does not help?
> The expired-token removal thread uses the same API, 
> *get_all_token_identifiers*, so the cleaner itself struggles once the number 
> of rows grows large.
>
> There is a related feature, https://issues.apache.org/jira/browse/HIVE-17609, 
> but it also uses *get_all_token_identifiers*, so it will run into the same 
> issue.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
