> On Nov. 5, 2014, 9:23 p.m., Szehon Ho wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SparkMapJoinResolver.java,
> >  line 254
> > <https://reviews.apache.org/r/27640/diff/1/?file=750693#file750693line254>
> >
>     Are you sure we don't need to initialize the HTSOperator's values like
>     it does in LocalMapJoinProcFactory?
> 
> Suhas Satish wrote:
>     I will take a closer look.

I dug into the history of this changeset a bit. 
It was introduced in this commit 
https://github.com/apache/hive/commit/9b4ba6a9bb2a1184857fc8cca11e3dc6c48c1380

From one of the comments on HIVE-4867:
there is a problem in mapjoin on tez. MR compiler replaces RS with HashSink 
made from value exprs of Join but Tez compiler uses RS as is, assuming it has 
same columns with value exprs of Join, which is not true.

HIVE-4867 dedups columns in the RS for reducer joins and the RS for order-by. 
But the small aliases of a mapjoin in MR tasks still contain key columns in 
their value exprs.
 
At worst, skipping this initialization is a performance issue (a slightly 
larger memory footprint); it does not affect functionality.
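
To make the footprint point concrete, here is a minimal sketch of what 
deduplicating key columns out of the value expressions amounts to. It is an 
illustration only: plain strings stand in for Hive's expression descriptors, 
and the class and method names are hypothetical, not taken from the Hive 
codebase.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration: strings stand in for Hive expression descriptors.
public class ValueColumnDedup {

    // Returns the value columns, minus any column that already appears as a key.
    public static List<String> dropKeyColumns(List<String> keyCols, List<String> valueCols) {
        Set<String> keys = new HashSet<>(keyCols);
        List<String> result = new ArrayList<>();
        for (String col : valueCols) {
            if (!keys.contains(col)) {
                result.add(col); // keep only columns not already stored with the key
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("id");
        List<String> values = Arrays.asList("id", "name", "price");
        // Without the dedup, "id" would be stored twice per hash table entry.
        System.out.println(dropKeyColumns(keys, values)); // prints [name, price]
    }
}
```

If the dedup is skipped, the join still sees every column it needs; the key 
column is simply carried a second time in each value row.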


- Suhas


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27640/#review60031
-----------------------------------------------------------


On Nov. 5, 2014, 8:29 p.m., Suhas Satish wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/27640/
> -----------------------------------------------------------
> 
> (Updated Nov. 5, 2014, 8:29 p.m.)
> 
> 
> Review request for hive, Chao Sun, Jimmy Xiang, Szehon Ho, and Xuefu Zhang.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> This replaces ReduceSinks with HashTableSinks for the smaller tables of a 
> map-join. However, the field that is checked to detect a map-join is actually 
> set in CommonJoinResolver, which doesn't exist yet. We need to decide where 
> this field should be populated. 
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SparkMapJoinResolver.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java 795a5d7 
> 
> Diff: https://reviews.apache.org/r/27640/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Suhas Satish
> 
>
