Re: Review Request 20051: HIVE-4904: A little more CP crossing RS boundaries
> On April 7, 2014, 6:03 p.m., John Pullokkaran wrote:
> > I took a look at this change; my knowledge of hive code is rather limited.
> > 1. Column Pruner doesn't cross Script operator boundary. Theoretically you
> > could prune above and below the script op separately.
> > 2. It seems column pruner assumes that parent of UDTF is always select; but
> > we haven't formalized this assumption. Other processors should throw an
> > exception if they ever come across a child that is a UDTF. Theoretically you
> > can push down certain filters below builtin UDTF. We may not be doing that
> > today.
> > 3. In Select Pruner it seems like there is no difference between
> > 'prunedCols' and 'columns'.

Thanks John. Here are responses to your points:

1. Column Pruner doesn't cross Script operator boundary.
The ColumnPrunerWalker explicitly stops at the SelectOp parent of a ScriptOp. This may have been OK when it was developed; as you point out, it now makes sense to continue pruning on the SelectOp ancestors. Can you file a JIRA for this?

2. The check in ColumnPrunerSelectProc is needed for the LVJoin case, where for the UDTFOp you end up with an empty PrunedList. What I realized was that Navis's fix doesn't cover the LVJoin case. Yes, this should be revisited.

- Harish


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20051/#review39706
---


On April 6, 2014, 1:33 a.m., Harish Butani wrote:
>
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/20051/
> ---
>
> (Updated April 6, 2014, 1:33 a.m.)
>
>
> Review request for hive, Ashutosh Chauhan and Navis Ryu.
>
>
> Bugs: HIVE-4904
>     https://issues.apache.org/jira/browse/HIVE-4904
>
>
> Repository: hive-git
>
>
> Description
> ---
>
> Currently, CP context cannot be propagated over RS except for JOIN/EXT. A
> little more CP is possible.
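Point 1 in the exchange above (pruning above and below a script op separately) can be pictured with a small toy model. This is only a sketch under stated assumptions: `Op`, `prune`, and `example` are hypothetical names invented for illustration, the plan is a simple linear chain, and none of this is Hive's ColumnPrunerWalker or its operator classes. The script operator is treated as opaque: it keeps all its outputs and reads every column its parent emits, so pruning restarts above it from the parent's full output.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class ScriptBoundaryPruning {

    /** Toy operator (hypothetical, not Hive's Operator class). */
    public static final class Op {
        public final String name;
        public final boolean opaque; // true for a script-like operator
        // output column -> input columns it is computed from
        final Map<String, List<String>> exprs = new LinkedHashMap<>();

        public Op(String name, boolean opaque) {
            this.name = name;
            this.opaque = opaque;
        }

        public Op col(String out, String... inputs) {
            exprs.put(out, List.of(inputs));
            return this;
        }

        public List<String> outputs() {
            return new ArrayList<>(exprs.keySet());
        }
    }

    /**
     * Prunes a linear chain ordered source first, sink last.
     * 'needed' holds the columns read by whatever consumes the last operator.
     */
    public static void prune(List<Op> chain, Set<String> needed) {
        Set<String> wanted = needed; // null means "every column of the parent"
        for (int i = chain.size() - 1; i >= 0; i--) {
            Op op = chain.get(i);
            if (op.opaque) {
                // An opaque script keeps all its outputs and reads all of its
                // parent's columns, so pruning restarts above it.
                wanted = null;
                continue;
            }
            if (wanted != null) {
                op.exprs.keySet().retainAll(wanted); // drop unused outputs
            }
            Set<String> next = new LinkedHashSet<>();
            for (List<String> inputs : op.exprs.values()) {
                next.addAll(inputs); // inputs of the surviving outputs
            }
            wanted = next;
        }
    }

    /** Builds TS[a,b,c,d] -> SEL1[a,b] -> SCRIPT[x,y] -> SEL2[x,y2]. */
    public static List<Op> example() {
        List<Op> chain = new ArrayList<>();
        chain.add(new Op("TS", false).col("a").col("b").col("c").col("d"));
        chain.add(new Op("SEL1", false).col("a", "a").col("b", "b"));
        chain.add(new Op("SCRIPT", true).col("x").col("y"));
        chain.add(new Op("SEL2", false).col("x", "x").col("y2", "y"));
        return chain;
    }

    public static void main(String[] args) {
        List<Op> chain = example();
        prune(chain, Set.of("x"));
        for (Op op : chain) {
            System.out.println(op.name + " " + op.outputs());
        }
        // TS [a, b]
        // SEL1 [a, b]
        // SCRIPT [x, y]
        // SEL2 [x]
    }
}
```

In this toy run the select below the script loses `y2`, the script itself is untouched, and the table scan above it drops `c` and `d`. A walker that stops at the select feeding the script would leave the table scan reading all four columns.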
Re: Review Request 20051: HIVE-4904: A little more CP crossing RS boundaries
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20051/#review39706
---

I took a look at this change; my knowledge of the Hive code is rather limited.

1. Column Pruner doesn't cross the Script operator boundary. Theoretically you could prune above and below the script op separately.
2. It seems the column pruner assumes that the parent of a UDTF is always a select, but we haven't formalized this assumption. Other processors should throw an exception if they ever come across a child that is a UDTF. Theoretically you can push down certain filters below a builtin UDTF. We may not be doing that today.
3. In Select Pruner it seems like there is no difference between 'prunedCols' and 'columns'.

- John Pullokkaran


On April 6, 2014, 1:33 a.m., Harish Butani wrote:
>
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/20051/
> ---
>
> (Updated April 6, 2014, 1:33 a.m.)
>
>
> Review request for hive, Ashutosh Chauhan and Navis Ryu.
>
>
> Bugs: HIVE-4904
>     https://issues.apache.org/jira/browse/HIVE-4904
>
>
> Repository: hive-git
>
>
> Description
> ---
>
> Currently, CP context cannot be propagated over RS except for JOIN/EXT. A
> little more CP is possible.
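The suggestion in point 2 of the review, formalizing the "parent of a UDTF is always a select" assumption by throwing when it is violated, could look roughly like the sketch below. Everything here is hypothetical: `Kind`, `Node`, and `checkUdtfParents` are illustrative names over a toy plan tree, not Hive operators or an existing Hive check.

```java
import java.util.ArrayList;
import java.util.List;

public class UdtfParentCheck {

    public enum Kind { TABLE_SCAN, SELECT, FILTER, UDTF }

    /** Toy plan node (hypothetical, not Hive's Operator class). */
    public static final class Node {
        final Kind kind;
        final List<Node> children = new ArrayList<>();

        public Node(Kind kind) {
            this.kind = kind;
        }

        /** Adds a child and returns this node so trees can be built inline. */
        public Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    /** Throws if any UDTF node hangs off a parent that is not a SELECT. */
    public static void checkUdtfParents(Node root) {
        for (Node child : root.children) {
            if (child.kind == Kind.UDTF && root.kind != Kind.SELECT) {
                throw new IllegalStateException(
                    "UDTF child under " + root.kind + ", expected SELECT");
            }
            checkUdtfParents(child);
        }
    }

    public static void main(String[] args) {
        Node ok = new Node(Kind.TABLE_SCAN)
            .add(new Node(Kind.SELECT).add(new Node(Kind.UDTF)));
        checkUdtfParents(ok); // assumption holds, nothing is thrown

        Node bad = new Node(Kind.TABLE_SCAN)
            .add(new Node(Kind.FILTER).add(new Node(Kind.UDTF)));
        try {
            checkUdtfParents(bad);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Failing fast like this turns an implicit assumption into a visible one: if a later optimization ever places a UDTF under some other operator, the mismatch shows up as an exception rather than as a silently wrong pruned column list.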
Re: Review Request 20051: HIVE-4904: A little more CP crossing RS boundaries
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20051/
---

(Updated April 6, 2014, 1:33 a.m.)


Review request for hive, Ashutosh Chauhan and Navis Ryu.


Changes
---

fix issues with PTF, LVJs and analyze commands


Bugs: HIVE-4904
    https://issues.apache.org/jira/browse/HIVE-4904


Repository: hive-git


Description
---

Currently, CP context cannot be propagated over RS except for JOIN/EXT. A little more CP is possible.


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPruner.java 58a9b59
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcCtx.java db36151
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcFactory.java 0690fb7
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java 3f16dc2
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/CorrelationUtilities.java 94224b3
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 3b33dc2
  ql/src/test/queries/clientpositive/order_within_subquery.q PRE-CREATION
  ql/src/test/results/clientpositive/annotate_stats_select.q.out 1e982e6
  ql/src/test/results/clientpositive/auto_join18.q.out b8677f4
  ql/src/test/results/clientpositive/auto_join27.q.out a576190
  ql/src/test/results/clientpositive/auto_join30.q.out 8709198
  ql/src/test/results/clientpositive/auto_join31.q.out 1936e45
  ql/src/test/results/clientpositive/auto_join32.q.out 05f53e6
  ql/src/test/results/clientpositive/auto_sortmerge_join_10.q.out 8882aac
  ql/src/test/results/clientpositive/count.q.out eb048b6
  ql/src/test/results/clientpositive/distinct_stats.q.out f715ea3
  ql/src/test/results/clientpositive/groupby2_map.q.out 291f196
  ql/src/test/results/clientpositive/groupby2_map_skew.q.out d005b6c
  ql/src/test/results/clientpositive/groupby3_map.q.out 1dfee08
  ql/src/test/results/clientpositive/groupby3_map_skew.q.out 7af59bc
  ql/src/test/results/clientpositive/groupby_cube1.q.out 92d81f4
  ql/src/test/results/clientpositive/groupby_distinct_samekey.q.out b405978
  ql/src/test/results/clientpositive/groupby_map_ppr.q.out 27eff75
  ql/src/test/results/clientpositive/groupby_multi_insert_common_distinct.q.out ad76252
  ql/src/test/results/clientpositive/groupby_multi_single_reducer3.q.out 51a70c4
  ql/src/test/results/clientpositive/groupby_position.q.out 727bccb
  ql/src/test/results/clientpositive/groupby_rollup1.q.out 36bf966
  ql/src/test/results/clientpositive/groupby_sort_11.q.out 8ee7571
  ql/src/test/results/clientpositive/groupby_sort_8.q.out a27cfaa
  ql/src/test/results/clientpositive/join18.q.out 7975c79
  ql/src/test/results/clientpositive/limit_pushdown.q.out 9c93ada
  ql/src/test/results/clientpositive/limit_pushdown_negative.q.out 115b171
  ql/src/test/results/clientpositive/metadataonly1.q.out 917efdf
  ql/src/test/results/clientpositive/multi_insert_gby2.q.out ab758cb
  ql/src/test/results/clientpositive/multi_insert_gby3.q.out 23ccebb
  ql/src/test/results/clientpositive/multi_insert_lateral_view.q.out 35e70b4
  ql/src/test/results/clientpositive/nullgroup.q.out 2ac7dea
  ql/src/test/results/clientpositive/nullgroup2.q.out cf31dc1
  ql/src/test/results/clientpositive/nullgroup4.q.out feae138
  ql/src/test/results/clientpositive/nullgroup4_multi_distinct.q.out 2ee357f
  ql/src/test/results/clientpositive/order_within_subquery.q.out PRE-CREATION
  ql/src/test/results/clientpositive/reduce_deduplicate_extended.q.out 9c6d14e
  ql/src/test/results/clientpositive/udf_count.q.out fb45708
  ql/src/test/results/clientpositive/union11.q.out f226f35
  ql/src/test/results/clientpositive/union14.q.out a6d349b
  ql/src/test/results/clientpositive/union15.q.out 88c9553
  ql/src/test/results/clientpositive/union16.q.out 2bd8d5e
  ql/src/test/results/clientpositive/union2.q.out 0fac9d9
  ql/src/test/results/clientpositive/union25.q.out 1ebe682
  ql/src/test/results/clientpositive/union28.q.out 4252062
  ql/src/test/results/clientpositive/union3.q.out 7f1b7fb
  ql/src/test/results/clientpositive/union30.q.out 194b3b8
  ql/src/test/results/clientpositive/union31.q.out 2f7031f
  ql/src/test/results/clientpositive/union5.q.out 0087393
  ql/src/test/results/clientpositive/union7.q.out 3a2d88c
  ql/src/test/results/clientpositive/union9.q.out c6cc511
  ql/src/test/results/clientpositive/union_view.q.out 4eaaeaa
  ql/src/test/results/clientpositive/vectorization_limit.q.out 4ffcd03
  ql/src/test/results/compiler/plan/groupby2.q.xml 58623d9
  ql/src/test/results/compiler/plan/groupby3.q.xml 65b403b

Diff: https://reviews.apache.org/r/20051/diff/


Testing
---


Thanks,

Harish Butani
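The description only says that column-pruning (CP) context can be carried across more ReduceSink (RS) operators; the patch itself is not reproduced in this thread. As a rough illustration of the general idea, and not of the actual change, the sketch below assumes a toy ReduceSink with key and value column lists: keys are always kept because they drive partitioning and sorting, values survive only if the child needs them, and the union becomes the requirement passed to the RS parent. `pruneValues` and `requiredFromParent` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class ReduceSinkPruning {

    /** Value columns a toy ReduceSink still has to ship to its child. */
    public static List<String> pruneValues(List<String> valueCols,
                                           Set<String> neededByChild) {
        List<String> kept = new ArrayList<>();
        for (String v : valueCols) {
            if (neededByChild.contains(v)) {
                kept.add(v);
            }
        }
        return kept;
    }

    /** Columns the ReduceSink's parent must produce: all keys plus kept values. */
    public static Set<String> requiredFromParent(List<String> keyCols,
                                                 List<String> keptValues) {
        Set<String> required = new LinkedHashSet<>(keyCols);
        required.addAll(keptValues);
        return required;
    }

    public static void main(String[] args) {
        List<String> keys = List.of("k");
        List<String> values = List.of("v1", "v2", "v3");

        List<String> kept = pruneValues(values, Set.of("v2"));
        System.out.println(kept);                           // [v2]
        System.out.println(requiredFromParent(keys, kept)); // [k, v2]
    }
}
```

If the needed-column set cannot be passed across the ReduceSink, everything above it has to keep producing `v1` and `v3` even though nothing downstream reads them.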
Re: Review Request 20051: HIVE-4904: A little more CP crossing RS boundaries
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20051/
---

(Updated April 5, 2014, 1:34 a.m.)


Review request for hive, Ashutosh Chauhan and Navis Ryu.


Changes
---

add test for HIVE-6819 case
update TestParse out files.


Bugs: HIVE-4904
    https://issues.apache.org/jira/browse/HIVE-4904


Repository: hive-git


Description
---

Currently, CP context cannot be propagated over RS except for JOIN/EXT. A little more CP is possible.


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPruner.java 58a9b59
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcCtx.java db36151
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcFactory.java 0690fb7
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java 3f16dc2
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/CorrelationUtilities.java 94224b3
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 3b33dc2
  ql/src/test/queries/clientpositive/order_within_subquery.q PRE-CREATION
  ql/src/test/results/clientpositive/auto_join18.q.out b8677f4
  ql/src/test/results/clientpositive/auto_join27.q.out a576190
  ql/src/test/results/clientpositive/auto_join30.q.out 8709198
  ql/src/test/results/clientpositive/auto_join31.q.out 1936e45
  ql/src/test/results/clientpositive/auto_join32.q.out 05f53e6
  ql/src/test/results/clientpositive/auto_sortmerge_join_10.q.out 8882aac
  ql/src/test/results/clientpositive/count.q.out eb048b6
  ql/src/test/results/clientpositive/groupby2_map.q.out 291f196
  ql/src/test/results/clientpositive/groupby2_map_skew.q.out d005b6c
  ql/src/test/results/clientpositive/groupby3_map.q.out 1dfee08
  ql/src/test/results/clientpositive/groupby3_map_skew.q.out 7af59bc
  ql/src/test/results/clientpositive/groupby_cube1.q.out 92d81f4
  ql/src/test/results/clientpositive/groupby_distinct_samekey.q.out b405978
  ql/src/test/results/clientpositive/groupby_map_ppr.q.out 27eff75
  ql/src/test/results/clientpositive/groupby_multi_insert_common_distinct.q.out ad76252
  ql/src/test/results/clientpositive/groupby_multi_single_reducer3.q.out 51a70c4
  ql/src/test/results/clientpositive/groupby_position.q.out 727bccb
  ql/src/test/results/clientpositive/groupby_rollup1.q.out 36bf966
  ql/src/test/results/clientpositive/groupby_sort_11.q.out 8ee7571
  ql/src/test/results/clientpositive/groupby_sort_8.q.out a27cfaa
  ql/src/test/results/clientpositive/join18.q.out 7975c79
  ql/src/test/results/clientpositive/metadataonly1.q.out 917efdf
  ql/src/test/results/clientpositive/multi_insert_gby2.q.out ab758cb
  ql/src/test/results/clientpositive/multi_insert_lateral_view.q.out 35e70b4
  ql/src/test/results/clientpositive/nullgroup.q.out 2ac7dea
  ql/src/test/results/clientpositive/nullgroup2.q.out cf31dc1
  ql/src/test/results/clientpositive/nullgroup4.q.out feae138
  ql/src/test/results/clientpositive/nullgroup4_multi_distinct.q.out 2ee357f
  ql/src/test/results/clientpositive/order_within_subquery.q.out PRE-CREATION
  ql/src/test/results/clientpositive/reduce_deduplicate_extended.q.out 9c6d14e
  ql/src/test/results/clientpositive/udf_count.q.out fb45708
  ql/src/test/results/clientpositive/union11.q.out f226f35
  ql/src/test/results/clientpositive/union14.q.out a6d349b
  ql/src/test/results/clientpositive/union15.q.out 88c9553
  ql/src/test/results/clientpositive/union16.q.out 2bd8d5e
  ql/src/test/results/clientpositive/union2.q.out 0fac9d9
  ql/src/test/results/clientpositive/union28.q.out 4252062
  ql/src/test/results/clientpositive/union30.q.out 194b3b8
  ql/src/test/results/clientpositive/union31.q.out 2f7031f
  ql/src/test/results/clientpositive/union5.q.out 0087393
  ql/src/test/results/clientpositive/union7.q.out 3a2d88c
  ql/src/test/results/clientpositive/union9.q.out c6cc511
  ql/src/test/results/compiler/plan/groupby2.q.xml 58623d9
  ql/src/test/results/compiler/plan/groupby3.q.xml 65b403b

Diff: https://reviews.apache.org/r/20051/diff/


Testing
---


Thanks,

Harish Butani