Re: Review Request 68124: HIVE-20252

2018-08-01 Thread Jesús Camacho Rodríguez

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68124/#review206752
---


Ship it!




Ship It!

- Jesús Camacho Rodríguez


On Aug. 1, 2018, 6:10 p.m., Deepak Jaiswal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68124/
> ---
> 
> (Updated Aug. 1, 2018, 6:10 p.m.)
> 
> 
> Review request for hive, Jesús Camacho Rodríguez and Jason Dere.
> 
> 
> Bugs: HIVE-20252
> https://issues.apache.org/jira/browse/HIVE-20252
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> See Jira.
> 
> removeSemiJoinCyclesDueToMapsideJoins is deprecated, although it has changes. 
> I will eventually remove it and can be ignored.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorUtils.java 7b2ae40107 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java 538aa5e924 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java c3eb886fd2 
> 
> 
> Diff: https://reviews.apache.org/r/68124/diff/5/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Deepak Jaiswal
> 
>



Re: Review Request 68124: HIVE-20252

2018-08-01 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68124/
---

(Updated Aug. 1, 2018, 6:10 p.m.)


Review request for hive, Jesús Camacho Rodríguez and Jason Dere.


Changes
---

Left out a minor change from previous patch.


Bugs: HIVE-20252
https://issues.apache.org/jira/browse/HIVE-20252


Repository: hive-git


Description
---

See Jira.

removeSemiJoinCyclesDueToMapsideJoins is deprecated, although it has changes. I 
will eventually remove it and can be ignored.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorUtils.java 7b2ae40107 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java 538aa5e924 
  ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java c3eb886fd2 


Diff: https://reviews.apache.org/r/68124/diff/5/

Changes: https://reviews.apache.org/r/68124/diff/4-5/


Testing
---


Thanks,

Deepak Jaiswal



Re: Review Request 68124: HIVE-20252

2018-08-01 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68124/
---

(Updated Aug. 1, 2018, 6:05 p.m.)


Review request for hive, Jesús Camacho Rodríguez and Jason Dere.


Changes
---

Implemented review comments.


Bugs: HIVE-20252
https://issues.apache.org/jira/browse/HIVE-20252


Repository: hive-git


Description
---

See Jira.

removeSemiJoinCyclesDueToMapsideJoins is deprecated, although it has changes. I 
will eventually remove it and can be ignored.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorUtils.java 7b2ae40107 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java 538aa5e924 
  ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java c3eb886fd2 


Diff: https://reviews.apache.org/r/68124/diff/4/

Changes: https://reviews.apache.org/r/68124/diff/3-4/


Testing
---


Thanks,

Deepak Jaiswal



Re: Review Request 68124: HIVE-20252

2018-08-01 Thread Deepak Jaiswal


> On Aug. 1, 2018, 2:39 a.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorUtils.java
> > Lines 451 (patched)
> > 
> >
> > We can remove this first block, it does not buy us much in terms of 
> > algorithm perfomance, and method would have no restriction on start 
> > operator (plus more readable).
> 
> Deepak Jaiswal wrote:
> No. It wont work without it. It is not for performance, it is for 
> correctness. The start in our case is the RS2, going up wont work as it will 
> stop when it encounters RS1.
> The more generic one is in SharedWorkOptimizer, this one, I am afraid is 
> for this particular case.
> 
> Jesús Camacho Rodríguez wrote:
> The block can be part of the caller logic, so if you have the chain:
> SEL->GBY1->RS1->GBY2->RS2
> then you end up passing the SEL as the start operator.
> 
> 
> Then the method in OperatorUtils has no restriction and it is reusable: 
> given any operator, 1) output all the operators contained in the same work, 
> 2) gather all the terminal operators of that work, and 3) gather all the 
> semijoin branches of that work.

Thanks.


- Deepak


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68124/#review206717
---


On Aug. 1, 2018, 12:27 a.m., Deepak Jaiswal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68124/
> ---
> 
> (Updated Aug. 1, 2018, 12:27 a.m.)
> 
> 
> Review request for hive, Jesús Camacho Rodríguez and Jason Dere.
> 
> 
> Bugs: HIVE-20252
> https://issues.apache.org/jira/browse/HIVE-20252
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> See Jira.
> 
> removeSemiJoinCyclesDueToMapsideJoins is deprecated, although it has changes. 
> I will eventually remove it and can be ignored.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorUtils.java 7b2ae40107 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java 538aa5e924 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java c3eb886fd2 
> 
> 
> Diff: https://reviews.apache.org/r/68124/diff/3/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Deepak Jaiswal
> 
>



Re: Review Request 68124: HIVE-20252

2018-08-01 Thread Jesús Camacho Rodríguez


> On Aug. 1, 2018, 2:39 a.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorUtils.java
> > Lines 451 (patched)
> > 
> >
> > We can remove this first block, it does not buy us much in terms of 
> > algorithm perfomance, and method would have no restriction on start 
> > operator (plus more readable).
> 
> Deepak Jaiswal wrote:
> No. It wont work without it. It is not for performance, it is for 
> correctness. The start in our case is the RS2, going up wont work as it will 
> stop when it encounters RS1.
> The more generic one is in SharedWorkOptimizer, this one, I am afraid is 
> for this particular case.

The block can be part of the caller logic, so if you have the chain:
SEL->GBY1->RS1->GBY2->RS2
then you end up passing the SEL as the start operator.


Then the method in OperatorUtils has no restriction and it is reusable: given 
any operator, 1) output all the operators contained in the same work, 2) gather 
all the terminal operators of that work, and 3) gather all the semijoin 
branches of that work.


- Jesús


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68124/#review206717
---


On Aug. 1, 2018, 12:27 a.m., Deepak Jaiswal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68124/
> ---
> 
> (Updated Aug. 1, 2018, 12:27 a.m.)
> 
> 
> Review request for hive, Jesús Camacho Rodríguez and Jason Dere.
> 
> 
> Bugs: HIVE-20252
> https://issues.apache.org/jira/browse/HIVE-20252
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> See Jira.
> 
> removeSemiJoinCyclesDueToMapsideJoins is deprecated, although it has changes. 
> I will eventually remove it and can be ignored.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorUtils.java 7b2ae40107 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java 538aa5e924 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java c3eb886fd2 
> 
> 
> Diff: https://reviews.apache.org/r/68124/diff/3/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Deepak Jaiswal
> 
>



Re: Review Request 68124: HIVE-20252

2018-08-01 Thread Jesús Camacho Rodríguez

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68124/#review206743
---




ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorUtils.java
Lines 451 (patched)


The block can be part of the caller logic, so if you have the chain:
SEL->GBY1->RS1->GBY2->RS2
then you end up passing the SEL as the start operator.

Then the method in OperatorUtils has no restriction and it is reusable: 
given any operator, 1) output all the operators contained in the same work, 2) 
gather all the terminal operators of that work, and 3) gather all the semijoin 
branches of that work.


- Jesús Camacho Rodríguez


On Aug. 1, 2018, 12:27 a.m., Deepak Jaiswal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68124/
> ---
> 
> (Updated Aug. 1, 2018, 12:27 a.m.)
> 
> 
> Review request for hive, Jesús Camacho Rodríguez and Jason Dere.
> 
> 
> Bugs: HIVE-20252
> https://issues.apache.org/jira/browse/HIVE-20252
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> See Jira.
> 
> removeSemiJoinCyclesDueToMapsideJoins is deprecated, although it has changes. 
> I will eventually remove it and can be ignored.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorUtils.java 7b2ae40107 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java 538aa5e924 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java c3eb886fd2 
> 
> 
> Diff: https://reviews.apache.org/r/68124/diff/3/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Deepak Jaiswal
> 
>



Re: Review Request 68124: HIVE-20252

2018-08-01 Thread Deepak Jaiswal


> On Aug. 1, 2018, 2:39 a.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorUtils.java
> > Lines 451 (patched)
> > 
> >
> > We can remove this first block, it does not buy us much in terms of 
> > algorithm perfomance, and method would have no restriction on start 
> > operator (plus more readable).

No. It wont work without it. It is not for performance, it is for correctness. 
The start in our case is the RS2, going up wont work as it will stop when it 
encounters RS1.
The more generic one is in SharedWorkOptimizer, this one, I am afraid is for 
this particular case.


> On Aug. 1, 2018, 2:39 a.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorUtils.java
> > Lines 462 (patched)
> > 
> >
> > Probably more useful to do the inverse, the private method void and the 
> > public method returns the operators in the work?

Aah, what was I thinking. I meant to do that only. Thanks for pointing this out.


- Deepak


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68124/#review206717
---


On Aug. 1, 2018, 12:27 a.m., Deepak Jaiswal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68124/
> ---
> 
> (Updated Aug. 1, 2018, 12:27 a.m.)
> 
> 
> Review request for hive, Jesús Camacho Rodríguez and Jason Dere.
> 
> 
> Bugs: HIVE-20252
> https://issues.apache.org/jira/browse/HIVE-20252
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> See Jira.
> 
> removeSemiJoinCyclesDueToMapsideJoins is deprecated, although it has changes. 
> I will eventually remove it and can be ignored.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorUtils.java 7b2ae40107 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java 538aa5e924 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java c3eb886fd2 
> 
> 
> Diff: https://reviews.apache.org/r/68124/diff/3/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Deepak Jaiswal
> 
>



Re: Review Request 68124: HIVE-20252

2018-07-31 Thread Jesús Camacho Rodríguez

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68124/#review206717
---




ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorUtils.java
Lines 451 (patched)


We can remove this first block, it does not buy us much in terms of 
algorithm perfomance, and method would have no restriction on start operator 
(plus more readable).



ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorUtils.java
Lines 462 (patched)


Probably more useful to do the inverse, the private method void and the 
public method returns the operators in the work?


- Jesús Camacho Rodríguez


On Aug. 1, 2018, 12:27 a.m., Deepak Jaiswal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68124/
> ---
> 
> (Updated Aug. 1, 2018, 12:27 a.m.)
> 
> 
> Review request for hive, Jesús Camacho Rodríguez and Jason Dere.
> 
> 
> Bugs: HIVE-20252
> https://issues.apache.org/jira/browse/HIVE-20252
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> See Jira.
> 
> removeSemiJoinCyclesDueToMapsideJoins is deprecated, although it has changes. 
> I will eventually remove it and can be ignored.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorUtils.java 7b2ae40107 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java 538aa5e924 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java c3eb886fd2 
> 
> 
> Diff: https://reviews.apache.org/r/68124/diff/3/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Deepak Jaiswal
> 
>



Re: Review Request 68124: HIVE-20252

2018-07-31 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68124/
---

(Updated Aug. 1, 2018, 12:27 a.m.)


Review request for hive, Jesús Camacho Rodríguez and Jason Dere.


Changes
---

Implemented review comments.


Bugs: HIVE-20252
https://issues.apache.org/jira/browse/HIVE-20252


Repository: hive-git


Description
---

See Jira.

removeSemiJoinCyclesDueToMapsideJoins is deprecated, although it has changes. I 
will eventually remove it and can be ignored.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorUtils.java 7b2ae40107 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java 538aa5e924 
  ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java c3eb886fd2 


Diff: https://reviews.apache.org/r/68124/diff/3/

Changes: https://reviews.apache.org/r/68124/diff/2-3/


Testing
---


Thanks,

Deepak Jaiswal



Re: Review Request 68124: HIVE-20252

2018-07-31 Thread Deepak Jaiswal


> On July 31, 2018, 11:38 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java
> > Line 914 (original), 917 (patched)
> > 
> >
> > Can be collapsed into single line in if condition.

I have been asked to not do that in other reviews before so I kept it that way. 
Lets keep it this way.


- Deepak


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68124/#review206707
---


On July 31, 2018, 11:07 p.m., Deepak Jaiswal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68124/
> ---
> 
> (Updated July 31, 2018, 11:07 p.m.)
> 
> 
> Review request for hive, Jesús Camacho Rodríguez and Jason Dere.
> 
> 
> Bugs: HIVE-20252
> https://issues.apache.org/jira/browse/HIVE-20252
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> See Jira.
> 
> removeSemiJoinCyclesDueToMapsideJoins is deprecated, although it has changes. 
> I will eventually remove it and can be ignored.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java 538aa5e924 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java c3eb886fd2 
> 
> 
> Diff: https://reviews.apache.org/r/68124/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Deepak Jaiswal
> 
>



Re: Review Request 68124: HIVE-20252

2018-07-31 Thread Deepak Jaiswal


> On July 31, 2018, 11:38 p.m., Jesús Camacho Rodríguez wrote:
> >

Thanks I will work on the comments.


- Deepak


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68124/#review206707
---


On July 31, 2018, 11:07 p.m., Deepak Jaiswal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68124/
> ---
> 
> (Updated July 31, 2018, 11:07 p.m.)
> 
> 
> Review request for hive, Jesús Camacho Rodríguez and Jason Dere.
> 
> 
> Bugs: HIVE-20252
> https://issues.apache.org/jira/browse/HIVE-20252
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> See Jira.
> 
> removeSemiJoinCyclesDueToMapsideJoins is deprecated, although it has changes. 
> I will eventually remove it and can be ignored.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java 538aa5e924 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java c3eb886fd2 
> 
> 
> Diff: https://reviews.apache.org/r/68124/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Deepak Jaiswal
> 
>



Re: Review Request 68124: HIVE-20252

2018-07-31 Thread Jesús Camacho Rodríguez

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68124/#review206707
---




ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java
Line 846 (original), 841 (patched)


Can we 1) move this method to OperatorUtils, and 2) keep this method 
private, and 3) create a void public entry method for the recursion, where we 
give the _found_ set to the private one, then return the _found_ set?



ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java
Line 865 (original)


Cool!



ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java
Line 914 (original), 917 (patched)


Can be collapsed into single line in if condition.



ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java
Line 918 (original), 922 (patched)


_workRSOps_ and _workTerminalOps_



ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java
Line 920 (original), 924 (patched)


No need to pass empty set, see comment above.



ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java
Line 934 (original)


Loop on _candidate_ here to populate _terminalOpToRSMap_.
Then you can get rid of the 3-level nested loops below.


- Jesús Camacho Rodríguez


On July 31, 2018, 11:07 p.m., Deepak Jaiswal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68124/
> ---
> 
> (Updated July 31, 2018, 11:07 p.m.)
> 
> 
> Review request for hive, Jesús Camacho Rodríguez and Jason Dere.
> 
> 
> Bugs: HIVE-20252
> https://issues.apache.org/jira/browse/HIVE-20252
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> See Jira.
> 
> removeSemiJoinCyclesDueToMapsideJoins is deprecated, although it has changes. 
> I will eventually remove it and can be ignored.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java 538aa5e924 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java c3eb886fd2 
> 
> 
> Diff: https://reviews.apache.org/r/68124/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Deepak Jaiswal
> 
>



Re: Review Request 68124: HIVE-20252

2018-07-31 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68124/
---

(Updated July 31, 2018, 11:07 p.m.)


Review request for hive, Jesús Camacho Rodríguez and Jason Dere.


Changes
---

New approach where a virtual edge is created from non-semijoin terminal 
operators in a task to semijoin terminal operators within the task.
This creates a cycle if there exists a task level cycle.


Bugs: HIVE-20252
https://issues.apache.org/jira/browse/HIVE-20252


Repository: hive-git


Description
---

See Jira.

removeSemiJoinCyclesDueToMapsideJoins is deprecated, although it has changes. I 
will eventually remove it and can be ignored.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java 538aa5e924 
  ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java c3eb886fd2 


Diff: https://reviews.apache.org/r/68124/diff/2/

Changes: https://reviews.apache.org/r/68124/diff/1-2/


Testing
---


Thanks,

Deepak Jaiswal



Review Request 68124: HIVE-20252

2018-07-30 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68124/
---

Review request for hive, Jesús Camacho Rodríguez and Jason Dere.


Bugs: HIVE-20252
https://issues.apache.org/jira/browse/HIVE-20252


Repository: hive-git


Description
---

See Jira.

removeSemiJoinCyclesDueToMapsideJoins is deprecated, although it has changes. I 
will eventually remove it and can be ignored.


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java 
011dadf495 
  ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java c3eb886fd2 


Diff: https://reviews.apache.org/r/68124/diff/1/


Testing
---


Thanks,

Deepak Jaiswal