[ https://issues.apache.org/jira/browse/HIVE-7826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14110770#comment-14110770 ]
Damien Carol commented on HIVE-7826:
------------------------------------

[~hagleitn] I'm really interested in this feature. We use partitioning heavily at my company. I would like to help you test it. How can I help?

> Dynamic partition pruning on Tez
> --------------------------------
>
>                 Key: HIVE-7826
>                 URL: https://issues.apache.org/jira/browse/HIVE-7826
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Gunther Hagleitner
>            Assignee: Gunther Hagleitner
>              Labels: TODOC14, tez
>         Attachments: HIVE-7826.1.patch, HIVE-7826.2.patch, HIVE-7826.3.patch
>
>
> It's natural in a star schema to map one or more dimensions to partition
> columns. Time or location are likely candidates.
> It can also be useful to compute the partitions one would like to scan via
> a subquery (where p in select ... from ...).
> The resulting joins in Hive require a full table scan of the large table
> though, because partition pruning takes place before the corresponding values
> are known.
> On Tez it's relatively straightforward to send the values needed to prune to
> the application master (AM), where splits are generated and tasks are
> submitted. Using these values we can strip out any unneeded partitions
> dynamically, while the query is running.
> The approach is straightforward:
> - Insert synthetic conditions for each join representing "x in (keys of other
>   side in join)"
> - These conditions will be pushed as far down as possible
> - If the condition hits a table scan and the column involved is a partition
>   column:
>   - Set up an operator to send key events to the AM
> - else:
>   - Remove the synthetic predicate

--
This message was sent by Atlassian JIRA
(v6.2#6252)
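
The approach in the issue description quoted above can be illustrated with a small Python sketch. This is a toy model only, not Hive or Tez code: TableScan, SyntheticPredicate, plan_pruning and generate_splits are hypothetical names made up for the illustration, the table is assumed to have at most one partition column, and the "push down" step is collapsed into a direct check against the scan.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class TableScan:
    """Hypothetical stand-in for a scan over a (possibly partitioned) table."""
    table: str
    partition_column: Optional[str]   # None if the table is not partitioned
    partitions: Dict[str, str]        # partition value -> partition location
    prune_on: Optional[str] = None    # set when key events should be sent to the AM


@dataclass
class SyntheticPredicate:
    """Represents "column in (keys of other side in join)"."""
    column: str


def plan_pruning(scan: TableScan, join_column: str) -> bool:
    """Insert a synthetic predicate for the join and push it to the scan.

    Returns True if the predicate is kept (the scan will wait for key
    events), False if it is removed because the column is not a
    partition column.
    """
    predicate = SyntheticPredicate(join_column)
    if scan.partition_column == predicate.column:
        # Partition column: arrange for key events to reach the AM.
        scan.prune_on = predicate.column
        return True
    # Not a partition column: the synthetic predicate is simply dropped.
    return False


def generate_splits(scan: TableScan, key_events: Set[str]) -> List[str]:
    """Toy version of split generation in the AM, run once keys have arrived."""
    if scan.prune_on is None:
        # No pruning was set up, so every partition is scanned.
        return sorted(scan.partitions.values())
    return sorted(
        location
        for value, location in scan.partitions.items()
        if value in key_events
    )


def example_sales_scan() -> TableScan:
    """Build a small hypothetical fact table partitioned by date."""
    return TableScan(
        table="sales",
        partition_column="ds",
        partitions={
            "2014-08-01": "/sales/ds=2014-08-01",
            "2014-08-02": "/sales/ds=2014-08-02",
            "2014-08-03": "/sales/ds=2014-08-03",
        },
    )
```

With this model, a join on the partition column "ds" keeps the predicate, and only the partitions whose values arrive as key events are scanned. A join on any other column drops the predicate and falls back to scanning every partition.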