[ 
https://issues.apache.org/jira/browse/HIVE-10319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nezih Yigitbasi reassigned HIVE-10319:
--------------------------------------

    Assignee: Nezih Yigitbasi

> Hive CLI startup takes a long time with a large number of databases
> -------------------------------------------------------------------
>
>                 Key: HIVE-10319
>                 URL: https://issues.apache.org/jira/browse/HIVE-10319
>             Project: Hive
>          Issue Type: Improvement
>          Components: CLI
>    Affects Versions: 1.0.0
>            Reporter: Nezih Yigitbasi
>            Assignee: Nezih Yigitbasi
>         Attachments: HIVE-10319.patch
>
>
> The Hive CLI takes a long time to start when there is a large number of 
> databases in the DW. I think the root cause is the way permanent UDFs are 
> loaded from the metastore. When I looked at the logs and the source code I 
> see that at startup Hive first gets all the databases from the metastore and 
> then for each database it makes a metastore call to get the permanent 
> functions for that database [see Hive.java | 
> https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L162-185].
>  So the number of metastore calls made is in the order of the number of 
> databases. In production we have several hundreds of databases so Hive makes 
> several hundreds of RPC calls during startup, taking 30+ seconds.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to