[ https://issues.apache.org/jira/browse/IMPALA-8606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Work on IMPALA-8606 started by Quanlong Huang.
----------------------------------------------
> GET_TABLES performance in local catalog mode
> --------------------------------------------
>
>                 Key: IMPALA-8606
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8606
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog
>    Affects Versions: Impala 3.2.0
>            Reporter: Balazs Jeszenszky
>            Assignee: Quanlong Huang
>            Priority: Blocker
>              Labels: catalog-v2
>
> With local catalog mode enabled, GET_TABLES JDBC requests will return more
> than the always available table information. Any request for more metadata
> about a table will trigger a full load of that table on the catalogd side,
> meaning that GET_TABLES triggers the load of the entire catalog. Also, as far
> as I can see, the requests for more metadata are made one table at a time.
> Once the tables are loaded on the catalogd-side, a coordinator needs 3
> roundtrips to the catalog to fetch all the details about a single table. My
> test case had around 57k tables, 1700 DBs, and ~120k partitions.
> GET_TABLES on a cold catalog takes 18 minutes. With a warm catalog, but cold
> impalad, it still takes ~70 seconds.
> Many tools use GET_TABLES to populate dropdowns, etc. so this is bad for both
> end user experience and catalog memory usage.
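
For a rough sense of scale, the figures in the report can be turned into a back-of-the-envelope estimate. The sketch below is not taken from Impala; the function name `estimate` and the dictionary keys are illustrative. It assumes the quoted numbers are exact (the report gives them as approximate), that every table costs exactly 3 roundtrips, and that the warm-catalog time is spent entirely on those roundtrips, which probably overstates the per-roundtrip cost.

```python
# Back-of-the-envelope estimate from the figures quoted in the report.
# All inputs are approximate ("around 57k tables", "~70 seconds").
TABLES = 57_000
ROUNDTRIPS_PER_TABLE = 3      # coordinator -> catalogd, per the report
WARM_SECONDS = 70             # warm catalogd, cold impalad
COLD_SECONDS = 18 * 60        # cold catalogd


def estimate(tables, roundtrips_per_table, warm_seconds, cold_seconds):
    """Return rough per-roundtrip and per-table costs (hypothetical helper)."""
    total_roundtrips = tables * roundtrips_per_table
    return {
        "total_roundtrips": total_roundtrips,
        # Assumes the warm-case time is dominated by roundtrips.
        "warm_ms_per_roundtrip": 1000 * warm_seconds / total_roundtrips,
        # Average cold-case cost per table, including the catalogd-side load.
        "cold_ms_per_table": 1000 * cold_seconds / tables,
    }


if __name__ == "__main__":
    result = estimate(TABLES, ROUNDTRIPS_PER_TABLE, WARM_SECONDS, COLD_SECONDS)
    for name, value in result.items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions a single GET_TABLES call issues on the order of 171,000 sequential roundtrips, so even a sub-millisecond cost per roundtrip adds up to roughly a minute. That suggests batching, or not fetching the extra metadata at all, matters more than speeding up individual requests.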