[ https://issues.apache.org/jira/browse/HIVE-18705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16540934#comment-16540934 ]
Vihang Karajgaonkar commented on HIVE-18705:
--------------------------------------------

Hi [~szita]

Sorry for taking so long to respond; I was on vacation last week. I have a (probably dumb :)) question: where does the performance improvement in this patch come from? I see you are reducing the number of getTable calls by using a batch iterator, but for "large" databases we still drop the tables one by one and then call dropDatabase. Am I missing something?

> Improve HiveMetaStoreClient.dropDatabase
> ----------------------------------------
>
>                 Key: HIVE-18705
>                 URL: https://issues.apache.org/jira/browse/HIVE-18705
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Adam Szita
>            Assignee: Adam Szita
>            Priority: Major
>         Attachments: HIVE-18705.0.patch, HIVE-18705.1.patch,
> HIVE-18705.2.patch, HIVE-18705.4.patch, HIVE-18705.5.patch,
> HIVE-18705.6.patch, HIVE-18705.7.patch, HIVE-18705.8.patch
>
>
> {{HiveMetaStoreClient.dropDatabase}} has a strange implementation to ensure
> dealing with client side hooks (for non-native tables e.g. HBase). Currently
> it starts by retrieving all the tables from HMS, and then sends {{dropTable}}
> calls to HMS table-by-table. At the end a {{dropDatabase}} just to be sure :)
> I believe this could be refactored so that it speeds up the dropDB in
> situations where the average table count per DB is very high.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)