-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/41984/#review113327
-----------------------------------------------------------

Ship it!


Ship It!

- Sergio Pena


On Jan. 6, 2016, 4:52 p.m., Aihua Xu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/41984/
> -----------------------------------------------------------
> 
> (Updated Jan. 6, 2016, 4:52 p.m.)
> 
> 
> Review request for hive, Sergio Pena and Xuefu Zhang.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> HIVE-12762: Common join on parquet tables returns incorrect result when 
> hive.optimize.index.filter set to true
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 
> 9a7d990baaabfde8e564f00bb1fcfe30cd16dc90 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ProjectionPusher.java 
> 017676bec2163f04fb95f43224a4f8743fa49f55 
>   ql/src/test/queries/clientpositive/parquet_join2.q PRE-CREATION 
>   ql/src/test/results/clientpositive/parquet_join2.q.out PRE-CREATION 
>   storage-api/src/java/org/apache/hadoop/hive/ql/io/sarg/ExpressionTree.java 
> 577d95d1a15a54c2804349b3e5e68d83b72df664 
>   
> storage-api/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentImpl.java
>  eeff131cbc14d7ef554517109612ae7d891f8003 
> 
> Diff: https://reviews.apache.org/r/41984/diff/
> 
> 
> Testing
> -------
> 
> We have two issues: 1. We are filtering the parquet columns based on the last 
> filter condition in the query. So if the query contains multiple instances of 
> the same table, e.g., join on the same table with different filter 
> conditions, then we could get incorrect result; 2. rewriteLeaves 
> implementation in SearchArgumentImpl is not accurate since the different 
> leaves could be sharing the same object. The current implementation could 
> change the leave index multiple times to an incorrect value.
> 
> The patch will merge all the filter conditions (create OR expression on all 
> the filters) so that the columns which will be used during operator won't be 
> filtered during earlier splitting stage. rewriteLeaves is reimplemented to 
> get all the unique leaves first and replace in place.
> 
> 
> Thanks,
> 
> Aihua Xu
> 
>

Reply via email to