[ 
https://issues.apache.org/jira/browse/HIVE-948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585269#comment-13585269
 ] 

Navis commented on HIVE-948:
----------------------------

While producing the result for "udf_reflect2", I realized that it's better not 
to merge two SEL operators if the child SEL references, twice or more, a 
column of the parent SEL that is the result of a function. For example,

select reflect2(ts, "getYear"), reflect2(ts, "getMonth") from (select cast(key as timestamp) as ts from tbl) a;
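
The rule above can be sketched as a small check. This is only an illustration in Python, not the code in the attached patches: the function name `can_merge_selects` and the two input structures are hypothetical, and the assumed motivation is that merging would inline the parent's function expression into every reference, so it would be evaluated once per reference instead of once.

```python
from collections import Counter


def can_merge_selects(parent_is_function, child_refs):
    """Decide whether a child SEL may be merged into its parent SEL.

    parent_is_function: dict mapping each parent output column name to
        True if that column is computed by a function call.
    child_refs: list of parent column names referenced by the child's
        expressions, one entry per reference (duplicates matter).

    Both structures are hypothetical stand-ins for operator metadata.
    """
    ref_counts = Counter(child_refs)
    for column, count in ref_counts.items():
        # A function-valued parent column referenced two or more times
        # blocks the merge; plain column references never do.
        if parent_is_function.get(column, False) and count >= 2:
            return False
    return True


# The query from the comment: "ts" is cast(key as timestamp) in the parent
# SEL and is referenced by both reflect2 calls in the child SEL.
mergeable = can_merge_selects({"ts": True}, ["ts", "ts"])
```

For the example query, `mergeable` is False: under the stated assumption, a merged SEL would have to compute cast(key as timestamp) once for "getYear" and again for "getMonth".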

                
> more query plan optimization rules 
> -----------------------------------
>
>                 Key: HIVE-948
>                 URL: https://issues.apache.org/jira/browse/HIVE-948
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Ning Zhang
>            Assignee: Navis
>         Attachments: HIVE-948.D8463.1.patch, HIVE-948.D8463.2.patch, 
> HIVE-948.D8463.3.patch, HIVE-948.D8463.3.patch, HIVE-948.D8463.4.patch, 
> HIVE-948.D8463.5.patch, HIVE-948.testresult_only.txt
>
>
> Many query plans are not optimal in that they contain redundant operators. 
> Some examples are unnecessary select operators (select followed by select, 
> select output being the same as input etc.). Even though these operators are 
> not very expensive, they could account for around 10% of CPU time in some 
> simple queries. It seems they are low-hanging fruit that we should pick 
> first. 
> BTW, it seems these optimization rules should be added at the last stage of 
> the physical optimization phase since some redundant operators are added to 
> facilitate physical plan generation. 

