[jira] [Commented] (HIVE-18705) Improve HiveMetaStoreClient.dropDatabase

Vihang Karajgaonkar (JIRA) Wed, 11 Jul 2018 17:34:12 -0700


    [ 
https://issues.apache.org/jira/browse/HIVE-18705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16540934#comment-16540934
 ]


Vihang Karajgaonkar commented on HIVE-18705:
--------------------------------------------

Hi [~szita], sorry for taking so long to respond; I was on vacation last week.
I have a (probably dumb :)) question: where does the performance improvement
come from in this patch? I see you are reducing the number of getTable calls by
using a batch iterator, but for "large" databases we still drop tables
one by one and then call dropDatabase. Am I missing something?
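
To make the question concrete, here is a rough back-of-the-envelope sketch of
the client-to-HMS round trips in the two approaches. It is not Hive code: the
class, the method names, the single "list tables" call and the batch size of
100 are all assumptions made for illustration, and it only counts calls rather
than measuring anything.

```java
// Hypothetical call-count model, NOT Hive code. All names and the batch size
// are illustrative assumptions; only the shape of the flow comes from the thread.
final class DropDatabaseCallCount {

    /** Round trips when each table object is fetched with its own call. */
    static long perTableFetch(long tables) {
        long list = 1;        // assumed: one call to list the table names
        long fetch = tables;  // one getTable-style call per table
        long drop = tables;   // one dropTable call per table
        long dropDb = 1;      // final dropDatabase call
        return list + fetch + drop + dropDb;
    }

    /** Round trips when table objects are fetched in batches of batchSize. */
    static long batchedFetch(long tables, long batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        long list = 1;                                      // assumed, as above
        long fetch = (tables + batchSize - 1) / batchSize;  // ceil(tables / batchSize)
        long drop = tables;                                 // still one drop per table
        long dropDb = 1;
        return list + fetch + drop + dropDb;
    }

    public static void main(String[] args) {
        long tables = 10_000;
        long batchSize = 100;
        System.out.println("per-table fetch: " + perTableFetch(tables));
        System.out.println("batched fetch:   " + batchedFetch(tables, batchSize));
    }
}
```

Under these assumptions the fetch side shrinks from one call per table to one
call per batch, but the per-table dropTable calls remain, so the total stays
linear in the number of tables.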

> Improve HiveMetaStoreClient.dropDatabase
> ----------------------------------------
>
>                 Key: HIVE-18705
>                 URL: https://issues.apache.org/jira/browse/HIVE-18705
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Adam Szita
>            Assignee: Adam Szita
>            Priority: Major
>         Attachments: HIVE-18705.0.patch, HIVE-18705.1.patch, 
> HIVE-18705.2.patch, HIVE-18705.4.patch, HIVE-18705.5.patch, 
> HIVE-18705.6.patch, HIVE-18705.7.patch, HIVE-18705.8.patch
>
>
> {{HiveMetaStoreClient.dropDatabase}} has a strange implementation to ensure
> that client-side hooks (for non-native tables, e.g. HBase) are handled.
> Currently it starts by retrieving all the tables from HMS, and then sends
> {{dropTable}} calls to HMS table by table. At the end it sends a
> {{dropDatabase}} call just to be sure :)
> I believe this could be refactored so that it speeds up the dropDB in
> situations where the average table count per DB is very high.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
