Thomas Bünger created DRILL-5216:
------------------------------------

             Summary: Set FetchSize to Speed up Metadata retrieval for JDBC storage plugin over high latency connections
                 Key: DRILL-5216
                 URL: https://issues.apache.org/jira/browse/DRILL-5216
             Project: Apache Drill
          Issue Type: Improvement
          Components: Storage - JDBC
    Affects Versions: 1.9.0
         Environment: drill-embedded on ubuntu client - connected to a remote Oracle
            Reporter: Thomas Bünger
            Priority: Minor

Metadata retrieval uses the default fetch size of the underlying JDBC driver, which in the case of Oracle is only 10 rows. In larger installations such as mine, the Oracle cluster hosts thousands of schemas, and the small fetch size results in hundreds of individual round trips. As a result, every Drill query against this storage takes at least a minute (the server is remote).

Drill currently uses the JDBC metadata API {{java.sql.DatabaseMetaData.getSchemas()}} inside JdbcStoragePlugin.java and could set an appropriate fetch size before iterating over the result set. I've tested this locally and it reduced latency a lot, but I am not sure how it affects other, non-Oracle JDBC drivers.

The other (potentially long-running) query is the table enumeration. From what I've seen, Drill does not call the JDBC driver directly here but goes through Apache Calcite, calling {{getTableNames()}}, which under the hood calls {{java.sql.DatabaseMetaData.getTables()}}. That call also contributes to slow metadata retrieval because of the small default fetch size.
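The proposal for the schema enumeration could look roughly like the sketch below. It is an illustration only, not the code in JdbcStoragePlugin.java: the class name, the helper names {{listSchemas}} and {{trySetFetchSize}}, and the value 1000 are all assumptions made for the example. Because the effect on non-Oracle drivers is unknown, the sketch treats the fetch size as a hint and carries on if a driver rejects it.

```java
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

public final class MetadataFetchSizeSketch {

    // Hypothetical value chosen for illustration; a real change would
    // probably make this configurable per storage plugin.
    static final int METADATA_FETCH_SIZE = 1000;

    private MetadataFetchSizeSketch() {
    }

    // Enumerates schema names, asking the driver to fetch rows in larger
    // batches so that fewer network round trips are needed.
    static List<String> listSchemas(DatabaseMetaData metaData) throws SQLException {
        List<String> schemas = new ArrayList<>();
        try (ResultSet rs = metaData.getSchemas()) {
            trySetFetchSize(rs, METADATA_FETCH_SIZE);
            while (rs.next()) {
                // TABLE_SCHEM is the schema-name column defined by the JDBC spec.
                schemas.add(rs.getString("TABLE_SCHEM"));
            }
        }
        return schemas;
    }

    // setFetchSize is only a hint. Returns false if the driver rejects it,
    // in which case the driver's default fetch size stays in effect.
    static boolean trySetFetchSize(ResultSet rs, int rows) {
        try {
            rs.setFetchSize(rows);
            return true;
        } catch (SQLException e) {
            return false;
        }
    }
}
```

The same hint could in principle be applied to the result set of {{getTables()}}, but since that call is made inside Calcite according to the description above, it would presumably need a change there rather than in Drill.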