> On June 15, 2016, 3:23 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/ASTConverter.java,
> >  line 247
> > <https://reviews.apache.org/r/48500/diff/4/?file=1416719#file1416719line247>
> >
> >     Can you add a comment explaining why we don't care about the SQL type?

This is taken from lines 146-147 in the original version of ASTConverter. We do 
not care about the type because we are only converting to AST, so we just 
create the references needed to extract the column names for the GBy.


> On June 15, 2016, 3:23 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/vector_groupby_reduce.q.out, line 788
> > <https://reviews.apache.org/r/48500/diff/4/?file=1416731#file1416731line788>
> >
> >     Plan change expected?

This is expected.
* In the original query, we have: GBy1 x,y - GBy2 x,y - OBy x,y.
* Calcite was returning: GBy1 y,x - GBy2 y,x - OBy x,y. Note that we cannot do 
anything about this, as Calcite considers the GBy1 columns in the order given 
by the underlying expression, which in this case is the order in which the 
columns are present in the TableScan. RS dedup was kicking in for 
GBy1 y,x - GBy2 y,x.
* With this patch, when we translate to AST, we get: GBy1 y,x - GBy2 x,y - OBy 
x,y. Observe that the order of the columns of the last GBy is changed to 
respect the order of the columns of the OBy. However, RS dedup does not kick in 
for GBy2 x,y - OBy x,y (because of the number of reducers).
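
As a rough illustration of the reordering described in the last bullet, here is 
a minimal Java sketch. It is not the code in the patch: the class and method 
names (GroupByKeyReorder, alignWithOrderBy) are hypothetical, columns are 
modelled as plain strings, and it assumes the GBy keys are distinct. It relies 
only on the fact that the order of GBy keys does not affect the grouping result.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;

// Hypothetical illustration only; not part of Hive or Calcite.
class GroupByKeyReorder {

    // Returns the GBy keys in the order used by the OBy when both refer to
    // exactly the same set of columns; otherwise returns the GBy keys unchanged.
    // Assumes the keys within each list are distinct.
    static List<String> alignWithOrderBy(List<String> groupByKeys,
                                         List<String> orderByKeys) {
        boolean sameColumns = groupByKeys.size() == orderByKeys.size()
                && new HashSet<>(groupByKeys).equals(new HashSet<>(orderByKeys));
        if (!sameColumns) {
            // Different column sets: leave the GBy order as it is.
            return new ArrayList<>(groupByKeys);
        }
        // Same columns: adopt the OBy order so both operators agree on key order.
        return new ArrayList<>(orderByKeys);
    }

    public static void main(String[] args) {
        List<String> gby2 = List.of("y", "x"); // order coming from the TableScan
        List<String> oby = List.of("x", "y");  // order requested by the query
        System.out.println(alignWithOrderBy(gby2, oby)); // prints [x, y]
    }
}
```

In this toy model only the last GBy is aligned with the OBy, which matches the 
GBy1 y,x - GBy2 x,y - OBy x,y shape above; GBy1 keeps the TableScan order.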

If we want to solve the problem completely, I could modify the patch and create 
a Calcite rule that propagates this column order top-down through the whole 
operator tree; the gains could potentially be big. We would then not need to 
modify the AST converter. Please let me know what you think and I will proceed 
accordingly.


- Jesús


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48500/#review137744
-----------------------------------------------------------


On June 13, 2016, 12:06 p.m., Jesús Camacho Rodríguez wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/48500/
> -----------------------------------------------------------
> 
> (Updated June 13, 2016, 12:06 p.m.)
> 
> 
> Review request for hive and Ashutosh Chauhan.
> 
> 
> Bugs: HIVE-13982
>     https://issues.apache.org/jira/browse/HIVE-13982
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> HIVE-13982
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/ASTConverter.java 353d8db41af10512c94c0700a9bb06a07d660190 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/ReduceSinkDeDuplication.java 77771c3eb8155defa99a223ccf4ee4b072abb40a 
>   ql/src/test/queries/clientpositive/limit_pushdown2.q PRE-CREATION 
>   ql/src/test/results/clientpositive/bucket_groupby.q.out e198617c82b8ab4c3ad3d8b255975413fbdc382d 
>   ql/src/test/results/clientpositive/limit_pushdown2.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/lineage3.q.out 12ae13e388b3cb9c051cb419b75682fa4296d211 
>   ql/src/test/results/clientpositive/perf/query45.q.out 04f9b02b019b6cf591dee48964a73fdb4a4b285f 
>   ql/src/test/results/clientpositive/spark/vectorization_14.q.out cb3d9a4da84a379e00550ce7e31893b304d5e560 
>   ql/src/test/results/clientpositive/tez/explainuser_1.q.out 1871c7e443cf775b09badc4cbf4b86e23ad9e525 
>   ql/src/test/results/clientpositive/tez/explainuser_2.q.out 553066039881f225634c08d93a9054df5636e5d2 
>   ql/src/test/results/clientpositive/tez/vector_groupby_reduce.q.out 7f00b064e5a91b45282823e2725e11ab7f508b01 
>   ql/src/test/results/clientpositive/tez/vectorization_14.q.out 2a598332207f4540defa21a107642aa0502e1a58 
>   ql/src/test/results/clientpositive/vector_groupby_reduce.q.out bc23b365b02b505d0f8e79cdacca3449bf46ead3 
>   ql/src/test/results/clientpositive/vectorization_14.q.out 6d4f13a23de5c184cd100af07ac19f24ba9fac4a 
> 
> Diff: https://reviews.apache.org/r/48500/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Jesús Camacho Rodríguez
> 
>
