[ 
https://issues.apache.org/jira/browse/DRILL-2173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308449#comment-14308449
 ] 

Jason Altekruse commented on DRILL-2173:
----------------------------------------

https://reviews.apache.org/r/30701/


> Enable querying partition information without reading all data
> --------------------------------------------------------------
>
>                 Key: DRILL-2173
>                 URL: https://issues.apache.org/jira/browse/DRILL-2173
>             Project: Apache Drill
>          Issue Type: New Feature
>          Components: Query Planning & Optimization
>    Affects Versions: 0.7.0
>            Reporter: Jason Altekruse
>            Assignee: Jason Altekruse
>         Attachments: Drill-2173-partition-queries-with-pruning.patch
>
>
> When reading a series of files in nested directories, Drill currently adds 
> columns representing the directory structure that was traversed to reach the 
> file currently being read. These columns are stored as varchar under the 
> names dir0, dir1, ...  As these are just regular columns, Drill allows 
> arbitrary queries against this data, such as aggregates, filters, and sorts. 
> To optimize reads, basic partition pruning has already been added for an 
> expression like dir0 = "2015" or a simple IN list, which is converted during 
> planning to a series of ORed equality expressions. If users want to query 
> the directory information dynamically, without including specific directory 
> names in the query, this currently prompts a full table scan and a filter 
> operation on the dir columns. This enhancement is to allow more complex 
> queries to be run against directory metadata while scanning only the 
> matching directories.
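
For illustration only, the idea in the description can be sketched in a few lines of Python. This is not Drill's implementation: the helper names dir_columns and prune, the /logs root, and the file names are all hypothetical. The sketch shows how dir0, dir1, ... values can be derived from file paths alone, and how a predicate over those values can select files without reading any file contents.

```python
from pathlib import PurePosixPath


def dir_columns(root, file_path):
    """Map each directory level between root and the file to dir0, dir1, ...

    Hypothetical helper: it only mirrors the naming scheme described above.
    """
    relative = PurePosixPath(file_path).relative_to(PurePosixPath(root))
    # parts[:-1] drops the file name, leaving only the directory levels.
    return {f"dir{i}": part for i, part in enumerate(relative.parts[:-1])}


def prune(root, file_paths, predicate):
    """Keep only files whose directory columns satisfy the predicate.

    Only the paths are inspected; no file contents are read.
    """
    return [p for p in file_paths if predicate(dir_columns(root, p))]


# Made-up directory layout used purely as an example.
files = [
    "/logs/2014/Q4/a.json",
    "/logs/2015/Q1/b.json",
    "/logs/2015/Q2/c.json",
]

# Static case, similar in spirit to the filter dir0 = "2015".
by_year = prune("/logs", files, lambda d: d.get("dir0") == "2015")

# Dynamic case: the newest dir0 value is computed from the paths themselves,
# then used to choose which directories would need to be scanned.
latest = max(dir_columns("/logs", p)["dir0"] for p in files)
newest = prune("/logs", files, lambda d: d.get("dir0") == latest)
```

The second query is the kind of case the issue targets: the directory name is not written in the query, yet the set of matching directories can still be determined from metadata before any scan.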



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
