[ https://issues.apache.org/jira/browse/DRILL-2173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308176#comment-14308176 ]
Jason Altekruse commented on DRILL-2173:
----------------------------------------

Patch must be applied on top of 2143 and then 2060

> Enable querying partition information without reading all data
> --------------------------------------------------------------
>
>                 Key: DRILL-2173
>                 URL: https://issues.apache.org/jira/browse/DRILL-2173
>             Project: Apache Drill
>          Issue Type: New Feature
>          Components: Query Planning & Optimization
>    Affects Versions: 0.7.0
>            Reporter: Jason Altekruse
>            Assignee: Jason Altekruse
>         Attachments: Drill-2173-partition-queries-with-pruning.patch
>
>
> When reading a series of files in nested directories, Drill currently adds
> columns representing the directory structure that was traversed to reach the
> file currently being read. These columns are stored as varchar under the
> names dir0, dir1, ... As these are just regular columns, Drill allows
> arbitrary queries against this data, including aggregates, filters, sorts,
> etc. To allow optimizing reads, basic partition pruning has already been
> added to prune in the case of an expression like dir0 = "2015" or a simple IN
> list, which is converted during planning to a series of ORs of equals
> expressions. If users want to query the directory information dynamically,
> without including specific directory names in the query, this currently
> prompts a full table scan and a filter operation on the dir columns. This
> enhancement is to allow more complex queries to be run against directory
> metadata while scanning only the matching directories.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)