Sudheesh Katkam created DRILL-3921:
--------------------------------------

             Summary: Hive LIMIT 1 queries takes too long
                 Key: DRILL-3921
                 URL: https://issues.apache.org/jira/browse/DRILL-3921
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Flow
            Reporter: Sudheesh Katkam
            Assignee: Sudheesh Katkam

Fragment initialization on a Hive table that is backed by a directory of many files can take a very long time. This is most evident in LIMIT 1 queries, which need only a single record. The root cause is that the underlying reader in HiveRecordReader is initialized when the constructor is called, rather than when setup is called.

Two changes need to be made:
1) Lazily initialize the underlying record reader in HiveRecordReader.
2) Allow running a callable as a proxy user within an operator (through OperatorContext). This is required because the underlying record reader must be initialized as a proxy user (a proxy for the owner of the file). Previously, this was handled while creating the record batch tree.
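
The two proposed changes can be illustrated with a minimal, self-contained sketch. This is not Drill's actual code: LazyRecordReader, UnderlyingReader, ProxyRunner and the "fileOwner" user name are all hypothetical stand-ins, and the demo runner does not really switch users (a real one would presumably delegate to something like Hadoop's UserGroupInformation). The sketch only shows the shape of the idea: the constructor stores what it needs, and the expensive open happens in setup(), wrapped in a callable that runs as the proxy user.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the expensive underlying reader (not a Drill or Hive class). */
interface UnderlyingReader {
    String next() throws Exception;
}

/** Hypothetical hook for running work as another user (assumed, not Drill's OperatorContext API). */
interface ProxyRunner {
    <T> T runAs(String user, Callable<T> work) throws Exception;
}

/** Reader with a cheap constructor: the underlying reader is opened only in setup(). */
class LazyRecordReader {
    private final String proxyUser;
    private final ProxyRunner runner;
    private final Callable<UnderlyingReader> opener;
    private UnderlyingReader delegate; // stays null until setup() runs

    LazyRecordReader(String proxyUser, ProxyRunner runner, Callable<UnderlyingReader> opener) {
        // Only store references here; no file or split is touched in the constructor.
        this.proxyUser = proxyUser;
        this.runner = runner;
        this.opener = opener;
    }

    void setup() throws Exception {
        if (delegate == null) {
            // The expensive open runs as the proxy user, at setup time.
            delegate = runner.runAs(proxyUser, opener);
        }
    }

    String next() throws Exception {
        if (delegate == null) {
            throw new IllegalStateException("setup() must be called before next()");
        }
        return delegate.next();
    }
}

class LazyInitDemo {
    /** Builds one reader per file, reads one record as a LIMIT 1 query would, and returns how many underlying readers were opened. */
    static int openedForLimitOne(int fileCount) throws Exception {
        AtomicInteger opened = new AtomicInteger();
        ProxyRunner runner = new ProxyRunner() {
            @Override
            public <T> T runAs(String user, Callable<T> work) throws Exception {
                return work.call(); // demo only: no real user switch happens here
            }
        };
        List<LazyRecordReader> readers = new ArrayList<>();
        for (int i = 0; i < fileCount; i++) {
            final String record = "row-from-file-" + i;
            readers.add(new LazyRecordReader("fileOwner", runner, () -> {
                opened.incrementAndGet(); // counts each expensive open
                return () -> record;
            }));
        }
        LazyRecordReader first = readers.get(0);
        first.setup(); // only the first reader is ever set up
        first.next();
        return opened.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("opened " + openedForLimitOne(100) + " of 100 readers");
    }
}
```

In this sketch, constructing 100 readers opens nothing; only the reader that is actually set up pays the initialization cost, which is the behavior a LIMIT 1 query benefits from.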