Re: Review Request 20051: HIVE-4904: A little more CP crossing RS boundaries

2014-04-07 Thread Harish Butani


> On April 7, 2014, 6:03 p.m., John Pullokkaran wrote:
> > I took a look at this change; my knowledge of the Hive code is rather limited.
> > 1. The Column Pruner doesn't cross the Script operator boundary. Theoretically,
> > you could prune above and below the Script op separately.
> > 2. It seems the column pruner assumes that the parent of a UDTF is always a
> > select, but we haven't formalized this assumption. Other processors should
> > throw an exception if they ever come across a child that is a UDTF.
> > Theoretically, you can push down certain filters below a built-in UDTF. We
> > may not be doing that today.
> > 3. In the Select Pruner, it seems like there is no difference between
> > 'prunedCols' and 'columns'.

Thanks, John. Here are responses to your points:

1. Column Pruner doesn't cross the Script operator boundary.
  The ColumnPrunerWalker explicitly stops at the SelectOp parent of a ScriptOp.
This may have been OK when it was developed; as you point out, it now makes
sense to continue pruning on the SelectOp's ancestors. Can you file a JIRA for
this?
2. The check in ColumnPrunerSelectProc is needed for the LVJoin case, where you
end up with an empty PrunedList for the UDTFOp. What I realized was that
Navis's fix doesn't cover the LVJoin case. Yes, this should be revisited.
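
To make point 1 concrete, here is a minimal sketch of a walk that either stops at the select feeding a script operator or keeps going to its ancestors. It uses a toy single-parent operator chain; `ToyOp`, `ScriptBoundaryWalkSketch`, and the `stopAtScriptSelect` flag are hypothetical names for illustration and are not Hive's Operator or ColumnPrunerWalker API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy operator: a kind label plus a single parent (assumption:
// a linear chain is enough to show the boundary behaviour).
final class ToyOp {
    final String kind;   // e.g. "TS", "SEL", "SCRIPT", "FS"
    final ToyOp parent;  // null for the root of the chain

    ToyOp(String kind, ToyOp parent) {
        this.kind = kind;
        this.parent = parent;
    }
}

final class ScriptBoundaryWalkSketch {
    // Walks from a leaf toward the root and returns the kinds of the
    // operators visited. With stopAtScriptSelect set, the walk ends at the
    // SEL whose child is a SCRIPT operator, so its ancestors are skipped.
    static List<String> visited(ToyOp leaf, boolean stopAtScriptSelect) {
        List<String> out = new ArrayList<>();
        ToyOp child = null;
        for (ToyOp op = leaf; op != null; child = op, op = op.parent) {
            out.add(op.kind);
            boolean selectFeedingScript = op.kind.equals("SEL")
                    && child != null && child.kind.equals("SCRIPT");
            if (stopAtScriptSelect && selectFeedingScript) {
                break;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        ToyOp ts = new ToyOp("TS", null);
        ToyOp sel1 = new ToyOp("SEL", ts);
        ToyOp sel2 = new ToyOp("SEL", sel1);
        ToyOp script = new ToyOp("SCRIPT", sel2);
        ToyOp fs = new ToyOp("FS", script);

        System.out.println(visited(fs, true));   // stops at the SEL above SCRIPT
        System.out.println(visited(fs, false));  // continues to the remaining ancestors
    }
}
```

In this toy model, dropping the early stop is what "continue pruning on the SelectOp's ancestors" would amount to; the real change would also have to decide which columns the script operator needs.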


- Harish


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20051/#review39706
---



Re: Review Request 20051: HIVE-4904: A little more CP crossing RS boundaries

2014-04-07 Thread John Pullokkaran



I took a look at this change; my knowledge of the Hive code is rather limited.
1. The Column Pruner doesn't cross the Script operator boundary. Theoretically,
you could prune above and below the Script op separately.
2. It seems the column pruner assumes that the parent of a UDTF is always a
select, but we haven't formalized this assumption. Other processors should
throw an exception if they ever come across a child that is a UDTF.
Theoretically, you can push down certain filters below a built-in UDTF. We may
not be doing that today.
3. In the Select Pruner, it seems like there is no difference between
'prunedCols' and 'columns'.
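
One way to formalize the assumption in point 2 is an explicit check that fails fast when a UDTF shows up under anything other than a select. The sketch below is only an illustration under that assumption: `UdtfParentCheckSketch`, `checkChild`, and the string operator kinds are hypothetical and do not correspond to existing Hive classes or methods.

```java
// Hypothetical helper, not part of Hive: operator kinds are plain strings
// ("SEL", "UDTF", "FIL", ...) to keep the example self-contained.
final class UdtfParentCheckSketch {
    // Throws if a processor for a non-select parent encounters a UDTF child,
    // turning the implicit "parent of UDTF is always select" assumption
    // into an enforced invariant.
    static void checkChild(String parentKind, String childKind) {
        if ("UDTF".equals(childKind) && !"SEL".equals(parentKind)) {
            throw new IllegalStateException(
                    "UDTF child expects a SEL parent but found " + parentKind);
        }
    }

    public static void main(String[] args) {
        checkChild("SEL", "UDTF");  // accepted: select feeding a UDTF
        checkChild("FIL", "SEL");   // accepted: no UDTF involved
        try {
            checkChild("FIL", "UDTF");
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```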

- John Pullokkaran



Re: Review Request 20051: HIVE-4904: A little more CP crossing RS boundaries

2014-04-05 Thread Harish Butani

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20051/
---

(Updated April 6, 2014, 1:33 a.m.)


Review request for hive, Ashutosh Chauhan and Navis Ryu.


Changes
---

fix issues with PTF, LVJs and analyze commands


Bugs: HIVE-4904
https://issues.apache.org/jira/browse/HIVE-4904


Repository: hive-git


Description
---

Currently, the column pruner (CP) context cannot be propagated across
ReduceSink (RS) operators except for JOIN/EXT. A little more CP is possible.
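
As a rough illustration of what pruning across an RS boundary means, the sketch below trims the columns an RS forwards down to what its child needs. It assumes, for illustration only, that key columns are always kept and that only value columns are candidates for pruning; `ReduceSinkPruneSketch` and `prune` are hypothetical names, not the patch's code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical toy model of a ReduceSink's output columns; not Hive code.
final class ReduceSinkPruneSketch {
    // Returns the columns the RS still has to forward: every key column
    // (assumed to be required for the shuffle) followed by the value
    // columns that the child operator actually needs.
    static List<String> prune(List<String> keyCols, List<String> valueCols,
                              Set<String> neededByChild) {
        List<String> kept = new ArrayList<>(keyCols);
        for (String col : valueCols) {
            if (neededByChild.contains(col)) {
                kept.add(col);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("k");
        List<String> values = Arrays.asList("a", "b", "c");
        Set<String> needed = new HashSet<>(Arrays.asList("b"));
        // Only "k" and "b" survive; "a" and "c" no longer cross the RS.
        System.out.println(prune(keys, values, needed));
    }
}
```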


Diffs (updated)
---

  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPruner.java 58a9b59
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcCtx.java db36151
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcFactory.java 0690fb7
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java 3f16dc2
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/CorrelationUtilities.java 94224b3
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 3b33dc2 
  ql/src/test/queries/clientpositive/order_within_subquery.q PRE-CREATION 
  ql/src/test/results/clientpositive/annotate_stats_select.q.out 1e982e6 
  ql/src/test/results/clientpositive/auto_join18.q.out b8677f4 
  ql/src/test/results/clientpositive/auto_join27.q.out a576190 
  ql/src/test/results/clientpositive/auto_join30.q.out 8709198 
  ql/src/test/results/clientpositive/auto_join31.q.out 1936e45 
  ql/src/test/results/clientpositive/auto_join32.q.out 05f53e6 
  ql/src/test/results/clientpositive/auto_sortmerge_join_10.q.out 8882aac 
  ql/src/test/results/clientpositive/count.q.out eb048b6 
  ql/src/test/results/clientpositive/distinct_stats.q.out f715ea3 
  ql/src/test/results/clientpositive/groupby2_map.q.out 291f196 
  ql/src/test/results/clientpositive/groupby2_map_skew.q.out d005b6c 
  ql/src/test/results/clientpositive/groupby3_map.q.out 1dfee08 
  ql/src/test/results/clientpositive/groupby3_map_skew.q.out 7af59bc 
  ql/src/test/results/clientpositive/groupby_cube1.q.out 92d81f4 
  ql/src/test/results/clientpositive/groupby_distinct_samekey.q.out b405978 
  ql/src/test/results/clientpositive/groupby_map_ppr.q.out 27eff75 
  ql/src/test/results/clientpositive/groupby_multi_insert_common_distinct.q.out ad76252
  ql/src/test/results/clientpositive/groupby_multi_single_reducer3.q.out 51a70c4
  ql/src/test/results/clientpositive/groupby_position.q.out 727bccb 
  ql/src/test/results/clientpositive/groupby_rollup1.q.out 36bf966 
  ql/src/test/results/clientpositive/groupby_sort_11.q.out 8ee7571 
  ql/src/test/results/clientpositive/groupby_sort_8.q.out a27cfaa 
  ql/src/test/results/clientpositive/join18.q.out 7975c79 
  ql/src/test/results/clientpositive/limit_pushdown.q.out 9c93ada 
  ql/src/test/results/clientpositive/limit_pushdown_negative.q.out 115b171 
  ql/src/test/results/clientpositive/metadataonly1.q.out 917efdf 
  ql/src/test/results/clientpositive/multi_insert_gby2.q.out ab758cb 
  ql/src/test/results/clientpositive/multi_insert_gby3.q.out 23ccebb 
  ql/src/test/results/clientpositive/multi_insert_lateral_view.q.out 35e70b4 
  ql/src/test/results/clientpositive/nullgroup.q.out 2ac7dea 
  ql/src/test/results/clientpositive/nullgroup2.q.out cf31dc1 
  ql/src/test/results/clientpositive/nullgroup4.q.out feae138 
  ql/src/test/results/clientpositive/nullgroup4_multi_distinct.q.out 2ee357f 
  ql/src/test/results/clientpositive/order_within_subquery.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/reduce_deduplicate_extended.q.out 9c6d14e 
  ql/src/test/results/clientpositive/udf_count.q.out fb45708 
  ql/src/test/results/clientpositive/union11.q.out f226f35 
  ql/src/test/results/clientpositive/union14.q.out a6d349b 
  ql/src/test/results/clientpositive/union15.q.out 88c9553 
  ql/src/test/results/clientpositive/union16.q.out 2bd8d5e 
  ql/src/test/results/clientpositive/union2.q.out 0fac9d9 
  ql/src/test/results/clientpositive/union25.q.out 1ebe682 
  ql/src/test/results/clientpositive/union28.q.out 4252062 
  ql/src/test/results/clientpositive/union3.q.out 7f1b7fb 
  ql/src/test/results/clientpositive/union30.q.out 194b3b8 
  ql/src/test/results/clientpositive/union31.q.out 2f7031f 
  ql/src/test/results/clientpositive/union5.q.out 0087393 
  ql/src/test/results/clientpositive/union7.q.out 3a2d88c 
  ql/src/test/results/clientpositive/union9.q.out c6cc511 
  ql/src/test/results/clientpositive/union_view.q.out 4eaaeaa 
  ql/src/test/results/clientpositive/vectorization_limit.q.out 4ffcd03 
  ql/src/test/results/compiler/plan/groupby2.q.xml 58623d9 
  ql/src/test/results/compiler/plan/groupby3.q.xml 65b403b 

Diff: https://reviews.apache.org/r/20051/diff/


Testing
---


Thanks,

Harish Butani



Re: Review Request 20051: HIVE-4904: A little more CP crossing RS boundaries

2014-04-04 Thread Harish Butani


(Updated April 5, 2014, 1:34 a.m.)


Review request for hive, Ashutosh Chauhan and Navis Ryu.


Changes
---

add test for HIVE-6819 case
update TestParse out files.


Bugs: HIVE-4904
https://issues.apache.org/jira/browse/HIVE-4904


Repository: hive-git


Description
---

Currently, the column pruner (CP) context cannot be propagated across
ReduceSink (RS) operators except for JOIN/EXT. A little more CP is possible.


Diffs (updated)
---

  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPruner.java 58a9b59
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcCtx.java db36151
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcFactory.java 0690fb7
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java 3f16dc2
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/CorrelationUtilities.java 94224b3
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 3b33dc2 
  ql/src/test/queries/clientpositive/order_within_subquery.q PRE-CREATION 
  ql/src/test/results/clientpositive/auto_join18.q.out b8677f4 
  ql/src/test/results/clientpositive/auto_join27.q.out a576190 
  ql/src/test/results/clientpositive/auto_join30.q.out 8709198 
  ql/src/test/results/clientpositive/auto_join31.q.out 1936e45 
  ql/src/test/results/clientpositive/auto_join32.q.out 05f53e6 
  ql/src/test/results/clientpositive/auto_sortmerge_join_10.q.out 8882aac 
  ql/src/test/results/clientpositive/count.q.out eb048b6 
  ql/src/test/results/clientpositive/groupby2_map.q.out 291f196 
  ql/src/test/results/clientpositive/groupby2_map_skew.q.out d005b6c 
  ql/src/test/results/clientpositive/groupby3_map.q.out 1dfee08 
  ql/src/test/results/clientpositive/groupby3_map_skew.q.out 7af59bc 
  ql/src/test/results/clientpositive/groupby_cube1.q.out 92d81f4 
  ql/src/test/results/clientpositive/groupby_distinct_samekey.q.out b405978 
  ql/src/test/results/clientpositive/groupby_map_ppr.q.out 27eff75 
  ql/src/test/results/clientpositive/groupby_multi_insert_common_distinct.q.out ad76252
  ql/src/test/results/clientpositive/groupby_multi_single_reducer3.q.out 51a70c4
  ql/src/test/results/clientpositive/groupby_position.q.out 727bccb 
  ql/src/test/results/clientpositive/groupby_rollup1.q.out 36bf966 
  ql/src/test/results/clientpositive/groupby_sort_11.q.out 8ee7571 
  ql/src/test/results/clientpositive/groupby_sort_8.q.out a27cfaa 
  ql/src/test/results/clientpositive/join18.q.out 7975c79 
  ql/src/test/results/clientpositive/metadataonly1.q.out 917efdf 
  ql/src/test/results/clientpositive/multi_insert_gby2.q.out ab758cb 
  ql/src/test/results/clientpositive/multi_insert_lateral_view.q.out 35e70b4 
  ql/src/test/results/clientpositive/nullgroup.q.out 2ac7dea 
  ql/src/test/results/clientpositive/nullgroup2.q.out cf31dc1 
  ql/src/test/results/clientpositive/nullgroup4.q.out feae138 
  ql/src/test/results/clientpositive/nullgroup4_multi_distinct.q.out 2ee357f 
  ql/src/test/results/clientpositive/order_within_subquery.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/reduce_deduplicate_extended.q.out 9c6d14e 
  ql/src/test/results/clientpositive/udf_count.q.out fb45708 
  ql/src/test/results/clientpositive/union11.q.out f226f35 
  ql/src/test/results/clientpositive/union14.q.out a6d349b 
  ql/src/test/results/clientpositive/union15.q.out 88c9553 
  ql/src/test/results/clientpositive/union16.q.out 2bd8d5e 
  ql/src/test/results/clientpositive/union2.q.out 0fac9d9 
  ql/src/test/results/clientpositive/union28.q.out 4252062 
  ql/src/test/results/clientpositive/union30.q.out 194b3b8 
  ql/src/test/results/clientpositive/union31.q.out 2f7031f 
  ql/src/test/results/clientpositive/union5.q.out 0087393 
  ql/src/test/results/clientpositive/union7.q.out 3a2d88c 
  ql/src/test/results/clientpositive/union9.q.out c6cc511 
  ql/src/test/results/compiler/plan/groupby2.q.xml 58623d9 
  ql/src/test/results/compiler/plan/groupby3.q.xml 65b403b 

Diff: https://reviews.apache.org/r/20051/diff/


Testing
---


Thanks,

Harish Butani