> On June 19, 2015, 3:42 a.m., Xuefu Zhang wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/CombineEquivalentWorkResolver.java, line 207
> > <https://reviews.apache.org/r/34757/diff/2/?file=986303#file986303line207>
> >
> >     I think in SparkWork, there couldn't be two parents connecting to the 
> > same child. UnionWork would be such a child, but SparkWork doesn't have 
> > UnionWork, if I'm not mistaken.
> >     
> >     I don't think SparkPlan has a limitation of only one link between two 
> > trans. If there are two links from a parent to a child, the input will be 
> > self-unioned and the result is the input to the child.
> 
> chengxiang li wrote:
>     Take self-join for example: there would be 2 MapWorks connected to the 
> same ReduceWork. If we combine these 2 MapWorks into 1, SparkPlan::connect 
> would throw an exception during SparkPlan generation.

I see. Thanks for the explanation. However, I'm wondering if we should remove 
the restriction. Otherwise, certain cases such as self-join will not take 
advantage of this feature, right?
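
To make the concern concrete, here is a minimal sketch, assuming a plan graph 
that rejects a second link between the same parent and child. ToyPlan, connect 
and buildsCleanly are hypothetical names for illustration only, not Hive's 
SparkPlan API, and the exception type is an assumption as well:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy stand-in for a plan graph. Hypothetical; NOT Hive's SparkPlan.
class ToyPlan {
    // Each parent maps to the set of children it is already linked to.
    private final Map<String, Set<String>> links = new HashMap<>();

    // Assumption: a second link between the same pair is rejected.
    void connect(String parent, String child) {
        Set<String> children = links.computeIfAbsent(parent, k -> new HashSet<>());
        if (!children.add(child)) {
            throw new IllegalStateException(
                parent + " is already connected to " + child);
        }
    }

    // Replays the edges into a fresh plan, first renaming any work that was
    // combined into an equivalent one. Returns false if a link is rejected.
    static boolean buildsCleanly(List<String[]> edges, Map<String, String> combined) {
        ToyPlan plan = new ToyPlan();
        try {
            for (String[] edge : edges) {
                String parent = combined.getOrDefault(edge[0], edge[0]);
                String child = combined.getOrDefault(edge[1], edge[1]);
                plan.connect(parent, child);
            }
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Self-join shape: two equivalent MapWorks feed one ReduceWork.
        List<String[]> edges = Arrays.asList(
            new String[] {"Map 1", "Reducer 2"},
            new String[] {"Map 3", "Reducer 2"});

        Map<String, String> none = new HashMap<>();
        Map<String, String> combined = new HashMap<>();
        combined.put("Map 3", "Map 1"); // fold Map 3 into Map 1

        System.out.println(buildsCleanly(edges, none));     // true
        System.out.println(buildsCleanly(edges, combined)); // false: duplicate link
    }
}
```

If the restriction were lifted, connect would instead need to tolerate the 
second link, for example by self-unioning the input as suggested earlier in 
this thread.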


> On June 19, 2015, 3:42 a.m., Xuefu Zhang wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/CombineEquivalentWorkResolver.java, line 157
> > <https://reviews.apache.org/r/34757/diff/2/?file=986303#file986303line157>
> >
> >     Could parents be null, in case of top-level works? Same for children.
> 
> chengxiang li wrote:
>     SparkWork always returns a non-null List now, but that may change, so it 
> never hurts to add a null check.

Yeah, if that's the case, the original code is cleaner and easier to read. If 
something changes, the tests might just catch the NPE.
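
For comparison, a minimal sketch of the two styles being discussed. ParentLookup 
and both method names are hypothetical and only illustrate the trade-off; this 
is not the code under review:

```java
import java.util.Collections;
import java.util.List;

// Hypothetical helper; not part of Hive.
class ParentLookup {
    // Defensive style: tolerates a null list at the cost of an extra branch.
    static int countParentsDefensive(List<String> parents) {
        return parents == null ? 0 : parents.size();
    }

    // Contract style: assumes the caller never passes null. If that
    // assumption is ever broken, this fails fast with an NPE.
    static int countParents(List<String> parents) {
        return parents.size();
    }

    public static void main(String[] args) {
        List<String> topLevel = Collections.emptyList();
        System.out.println(countParents(topLevel));      // 0
        System.out.println(countParentsDefensive(null)); // 0
    }
}
```

With the contract style, a regression shows up as a failing test rather than 
being silently absorbed by the null branch.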


- Xuefu


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34757/#review88484
-----------------------------------------------------------


On June 19, 2015, 7:22 a.m., chengxiang li wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34757/
> -----------------------------------------------------------
> 
> (Updated June 19, 2015, 7:22 a.m.)
> 
> 
> Review request for hive and Xuefu Zhang.
> 
> 
> Bugs: HIVE-10844
>     https://issues.apache.org/jira/browse/HIVE-10844
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> Some Hive queries (like TPCDS Q39) may share the same subquery, which is 
> translated into separate but equivalent Works in SparkWork. Combining these 
> equivalent Works into a single one would help benefit from the subsequent 
> dynamic RDD caching optimization.
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/OperatorComparatorFactory.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/CombineEquivalentWorkResolver.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java 19aae70 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/JoinCondDesc.java b307b16 
>   ql/src/test/results/clientpositive/spark/auto_join30.q.out 7b5c5e7 
>   ql/src/test/results/clientpositive/spark/auto_smb_mapjoin_14.q.out 8a43d78 
>   ql/src/test/results/clientpositive/spark/groupby10.q.out 9d3cf36 
>   ql/src/test/results/clientpositive/spark/groupby7_map.q.out abd6459 
>   ql/src/test/results/clientpositive/spark/groupby7_map_skew.q.out 5e69b31 
>   ql/src/test/results/clientpositive/spark/groupby7_noskew.q.out 3418b99 
>   ql/src/test/results/clientpositive/spark/groupby7_noskew_multi_single_reducer.q.out 2cb126d 
>   ql/src/test/results/clientpositive/spark/groupby8.q.out 307395f 
>   ql/src/test/results/clientpositive/spark/groupby8_map_skew.q.out ba04a57 
>   ql/src/test/results/clientpositive/spark/insert_into3.q.out 7df5ba8 
>   ql/src/test/results/clientpositive/spark/join22.q.out b1e5b67 
>   ql/src/test/results/clientpositive/spark/skewjoinopt11.q.out 8a278ef 
>   ql/src/test/results/clientpositive/spark/union10.q.out 5e8fe38 
>   ql/src/test/results/clientpositive/spark/union11.q.out 20c27c7 
>   ql/src/test/results/clientpositive/spark/union20.q.out 6f0dca6 
>   ql/src/test/results/clientpositive/spark/union28.q.out 98582df 
>   ql/src/test/results/clientpositive/spark/union3.q.out 834b6d4 
>   ql/src/test/results/clientpositive/spark/union30.q.out 3409623 
>   ql/src/test/results/clientpositive/spark/union4.q.out c121ef0 
>   ql/src/test/results/clientpositive/spark/union5.q.out afee988 
>   ql/src/test/results/clientpositive/spark/union_remove_1.q.out ba0e293 
>   ql/src/test/results/clientpositive/spark/union_remove_15.q.out 26cfbab 
>   ql/src/test/results/clientpositive/spark/union_remove_16.q.out 7a7aaf2 
>   ql/src/test/results/clientpositive/spark/union_remove_18.q.out a5e15c5 
>   ql/src/test/results/clientpositive/spark/union_remove_19.q.out ad44400 
>   ql/src/test/results/clientpositive/spark/union_remove_20.q.out 1d67177 
>   ql/src/test/results/clientpositive/spark/union_remove_21.q.out 9f5b070 
>   ql/src/test/results/clientpositive/spark/union_remove_22.q.out 2e01432 
>   ql/src/test/results/clientpositive/spark/union_remove_24.q.out 2659798 
>   ql/src/test/results/clientpositive/spark/union_remove_25.q.out 0a94684 
>   ql/src/test/results/clientpositive/spark/union_remove_4.q.out 6c3d596 
>   ql/src/test/results/clientpositive/spark/union_remove_6.q.out cd36189 
>   ql/src/test/results/clientpositive/spark/union_remove_6_subq.q.out c981ae4 
>   ql/src/test/results/clientpositive/spark/union_remove_7.q.out 084fbd6 
>   ql/src/test/results/clientpositive/spark/union_top_level.q.out dede1ef 
> 
> Diff: https://reviews.apache.org/r/34757/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> chengxiang li
> 
>
