Re: Review Request 71040: HIVE-21923 Vectorized MapJoin may miss results when only the join key is selected

2019-07-11 Thread Zoltan Haindrich

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71040/
---

(Updated July 11, 2019, 9:48 a.m.)


Review request for hive and Jesús Camacho Rodríguez.


Changes
---

patch#7


Bugs: HIVE-21923
https://issues.apache.org/jira/browse/HIVE-21923


Repository: hive-git


Description
---

HIVE-21923
Vectorized MapJoin may miss results when only the join key is selected


Diffs (updated)
-

  
  common/src/test/org/apache/hadoop/hive/common/format/datetime/package-info.java 70ee4266f4
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/VectorMapJoinInnerBigOnlyGenerateResultOperator.java 35db84
  ql/src/test/queries/clientpositive/hybridgrace_hashjoin_2.q d989ca7dc8
  ql/src/test/results/clientpositive/llap/correlationoptimizer4.q.out 45a646c948
  ql/src/test/results/clientpositive/llap/hybridgrace_hashjoin_2.q.out 1ddc1ea1ec
  ql/src/test/results/clientpositive/spark/auto_join14.q.out 0c80c13889
  ql/src/test/results/clientpositive/spark/bucket_map_join_tez1.q.out 4ee669fa7d
  ql/src/test/results/clientpositive/tez/hybridgrace_hashjoin_2.q.out 8e9bd0513e


Diff: https://reviews.apache.org/r/71040/diff/2/

Changes: https://reviews.apache.org/r/71040/diff/1-2/


Testing
---


Thanks,

Zoltan Haindrich



Re: Review Request 71040: HIVE-21923 Vectorized MapJoin may miss results when only the join key is selected

2019-07-11 Thread Zoltan Haindrich


> On July 11, 2019, 4 a.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/VectorMapJoinInnerBigOnlyGenerateResultOperator.java
> > Line 255 (original), 255 (patched)
> > 
> >
> > Can we update this comment since it is not only the big table? Feel 
> > free to add any more info to understand better what is going on.

I've removed the bigtable keyword...I don't think extending it will help.
I feel that redesigning/reducing the 3-4 mapping things to 1 would make this 
code easier to understand, and that would also avoid bugs like this.
Most importantly, the part which puzzles these mappings together is hard to 
follow, and I think the problem arose from that.


> On July 11, 2019, 4 a.m., Jesús Camacho Rodríguez wrote:
> > ql/src/test/queries/clientpositive/hybridgrace_hashjoin_2.q
> > Line 10 (original), 9 (patched)
> > 
> >
> > Why is this disabled now? This is causing map join conversion not to be 
> > triggered below.

oh damn...I was making a final check that it's working correctly - looks like 
I've committed this...fixed; and all the other joins are mapjoins again


- Zoltan


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71040/#review216516
---




Re: Review Request 71040: HIVE-21923 Vectorized MapJoin may miss results when only the join key is selected

2019-07-10 Thread Jesús Camacho Rodríguez

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71040/#review216516
---




ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/VectorMapJoinInnerBigOnlyGenerateResultOperator.java
Line 255 (original), 255 (patched)


Can we update this comment since it is not only the big table? Feel free to 
add any more info to understand better what is going on.



ql/src/test/queries/clientpositive/hybridgrace_hashjoin_2.q
Line 10 (original), 9 (patched)


Why is this disabled now? This is causing map join conversion not to be 
triggered below.



ql/src/test/results/clientpositive/llap/hybridgrace_hashjoin_2.q.out
Line 83 (original), 84 (patched)


Map Join conversion not being triggered.



ql/src/test/results/clientpositive/llap/hybridgrace_hashjoin_2.q.out
Line 241 (original), 255 (patched)


Map Join conversion not being triggered.



ql/src/test/results/clientpositive/spark/bucket_map_join_tez1.q.out
Line 5149 (original), 5149 (patched)


Cool.


- Jesús Camacho Rodríguez




Review Request 71040: HIVE-21923 Vectorized MapJoin may miss results when only the join key is selected

2019-07-09 Thread Zoltan Haindrich

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71040/
---

Review request for hive and Jesús Camacho Rodríguez.


Bugs: HIVE-21923
https://issues.apache.org/jira/browse/HIVE-21923


Repository: hive-git


Description
---

HIVE-21923
Vectorized MapJoin may miss results when only the join key is selected


Diffs
-

  
  common/src/test/org/apache/hadoop/hive/common/format/datetime/package-info.java 70ee4266f45219fd81bf0d0df0a2c4380334e307
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/VectorMapJoinInnerBigOnlyGenerateResultOperator.java 35db844f236f24d2f17f4a43d064c9ebaf8c
  ql/src/test/queries/clientpositive/hybridgrace_hashjoin_2.q d989ca7dc883fa071cf5772f358c68bff78f659f
  ql/src/test/results/clientpositive/llap/correlationoptimizer4.q.out 45a646c948ec8b72710a6b8a3949fbe0203dd68e
  ql/src/test/results/clientpositive/llap/hybridgrace_hashjoin_2.q.out 2305f87e45bd65152a6c77ce04f7b8efad4724d7
  ql/src/test/results/clientpositive/spark/auto_join14.q.out 0c80c13889d134abe82bde30c98300620b1fd432
  ql/src/test/results/clientpositive/spark/bucket_map_join_tez1.q.out 4ee669fa7dd50e0373910030b35c8860383a3a70
  ql/src/test/results/clientpositive/tez/hybridgrace_hashjoin_2.q.out e28b15044503ea4bb5bd12b7caed6b105f337efd


Diff: https://reviews.apache.org/r/71040/diff/1/


Testing
---


Thanks,

Zoltan Haindrich