Amareshwari Sriramadasu created HIVE-4956:
---------------------------------------------

             Summary: Allow multiple tables in from clause if all them have the 
same schema, but can be partitioned differently
                 Key: HIVE-4956
                 URL: https://issues.apache.org/jira/browse/HIVE-4956
             Project: Hive
          Issue Type: Improvement
          Components: Query Processor
            Reporter: Amareshwari Sriramadasu
            Assignee: Amareshwari Sriramadasu


We have a usecase where the table storage partitioning changes over time.

For ex:
 we can have a table T1 which is partitioned by p1. But overtime, we want to 
partition the table on p1 and p2 as well. The new table can be T2. So, if we 
have to query table on partition p1, it will be a union query across two table 
T1 and T2. Especially with aggregations like avg, it becomes costly union query 
because we cannot make use of mapside aggregations and other optimizations.

The proposal is to support queries of the following format :

select t.x, t.y, .... from T1,T2 t where t.p1='x' OR t.p1='y' ... 
[groupby-clause] [having-clause] [orderby-clause] and so on.

Here we allow from clause as a comma separated list of tables with an alias and 
alias will be used in the full query, and partition pruning will happen on the 
actual tables to pick up the right paths. This will work because the difference 
is only on picking up the input paths and whole operator tree does not change. 
If this sounds a good usecase, I can put up the changes required to support the 
same.





--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to