[ https://issues.apache.org/jira/browse/DRILL-4577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15267726#comment-15267726 ]
Sean Hsuan-Yi Chu commented on DRILL-4577:
------------------------------------------

[~vkorukanti],

Data points regarding performance:

1. 1k tables in the DB:
Without this patch, it took around 550 seconds; with this patch, it took about 3.7 to 4.3 seconds.

2. 32k tables in the DB:
Without this patch, the result did not come back; with this patch, it took about 83.1 seconds.

Given these data points, I think this patch, along with the option, is really needed.

> Improve performance for query on INFORMATION_SCHEMA when HIVE is plugged in
> ---------------------------------------------------------------------------
>
>                 Key: DRILL-4577
>                 URL: https://issues.apache.org/jira/browse/DRILL-4577
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Storage - Hive
>            Reporter: Sean Hsuan-Yi Chu
>            Assignee: Sean Hsuan-Yi Chu
>             Fix For: 1.7.0
>
>
> A query such as
> {code}
> select * from INFORMATION_SCHEMA.`TABLES`
> {code}
> is converted into calls that fetch all tables from the storage plugins.
> When users have Hive, the calls to the Hive metadata storage would be:
> 1) get_table
> 2) get_partitions
> However, the partition information is not used in this type of query.
> Besides, a more efficient way to fetch tables is to use the
> get_multi_table call.
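
To illustrate the call pattern described in the issue, here is a minimal sketch that only counts round trips. It is not Drill's or Hive's code: `FakeMetastore`, both `list_tables_*` helpers, and the batch size of 100 are hypothetical names and values chosen for illustration, and the three methods merely borrow the call names mentioned in the issue.

```python
from typing import Dict, List


class FakeMetastore:
    """Hypothetical in-memory stand-in for a metastore; it counts round trips."""

    def __init__(self, tables: Dict[str, List[str]]):
        self._tables = tables  # table name -> partition names
        self.calls = 0

    def get_table(self, name: str) -> str:
        self.calls += 1
        if name not in self._tables:
            raise KeyError(name)
        return name

    def get_partitions(self, name: str) -> List[str]:
        self.calls += 1
        return self._tables[name]

    def get_multi_table(self, names: List[str]) -> List[str]:
        # One round trip returns every requested table that exists.
        self.calls += 1
        return [n for n in names if n in self._tables]


def list_tables_per_table(ms: FakeMetastore, names: List[str]) -> List[str]:
    """Two calls per table; the partition result is fetched but never used."""
    found = []
    for name in names:
        found.append(ms.get_table(name))
        ms.get_partitions(name)
    return found


def list_tables_batched(ms: FakeMetastore, names: List[str],
                        batch_size: int = 100) -> List[str]:
    """One call per batch of table names and no partition lookups."""
    found = []
    for start in range(0, len(names), batch_size):
        found.extend(ms.get_multi_table(names[start:start + batch_size]))
    return found
```

Under these assumptions, listing 1,000 tables costs 2,000 calls with the per-table pattern and 10 calls with batches of 100. The sketch shows only the difference in call counts; it does not model the timings reported above.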