[ 
https://issues.apache.org/jira/browse/HIVE-24999?focusedWorklogId=590989&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-590989
 ]

ASF GitHub Bot logged work on HIVE-24999:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 29/Apr/21 13:36
            Start Date: 29/Apr/21 13:36
    Worklog Time Spent: 10m 
      Work Description: zabetak commented on pull request #2172:
URL: https://github.com/apache/hive/pull/2172#issuecomment-829244102


   Closing as the fix was merged in 
https://github.com/apache/hive/commit/4f4cbeda00d5ebb7d0b8cedee5daa2c03df4a755.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

            Worklog Id:     (was: 590989)
    Remaining Estimate: 0h
            Time Spent: 10m

> HiveSubQueryRemoveRule generates invalid plan for IN subquery with multiple 
> correlations
> ----------------------------------------------------------------------------------------
>
>                 Key: HIVE-24999
>                 URL: https://issues.apache.org/jira/browse/HIVE-24999
>             Project: Hive
>          Issue Type: Bug
>          Components: CBO
>            Reporter: Stamatis Zampetakis
>            Assignee: Stamatis Zampetakis
>            Priority: Major
>             Fix For: 4.0.0
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> The problem can be reproduced with the following query, which can currently 
> be found in the {{subquery_in.q}} file:
> {code:sql}
> explain cbo select * from part where p_name IN (select p_name from part p 
> where p.p_size = part.p_size AND part.p_size + 121150 = p.p_partkey );
> {code}
> The plans before and after {{HiveSubQueryRemoveRule}} are shown below:
> {noformat}
> 2021-04-09T14:29:08,031 DEBUG [9f8b0342-5609-4917-95a9-e7abc884f619 main] 
> parse.CalcitePlanner: Plan before removing subquery:
> HiveProject(p_partkey=[$0], p_name=[$1], p_mfgr=[$2], p_brand=[$3], 
> p_type=[$4], p_size=[$5], p_container=[$6], p_retailprice=[$7], 
> p_comment=[$8])
>   HiveFilter(condition=[IN($1, {
> HiveProject(p_name=[$1])
>   HiveFilter(condition=[AND(=($5, $cor0.p_size), =(+($cor0.p_size, 121150), 
> $0))])
>     HiveTableScan(table=[[default, part]], table:alias=[p])
> })])
>     HiveTableScan(table=[[default, part]], table:alias=[part])
> 2021-04-09T14:29:08,056 DEBUG [9f8b0342-5609-4917-95a9-e7abc884f619 main] 
> parse.CalcitePlanner: Plan just after removing subquery:
> HiveProject(p_partkey=[$0], p_name=[$1], p_mfgr=[$2], p_brand=[$3], 
> p_type=[$4], p_size=[$5], p_container=[$6], p_retailprice=[$7], 
> p_comment=[$8])
>   HiveFilter(condition=[=($1, $12)])
>     LogicalCorrelate(correlation=[$cor0], joinType=[semi], 
> requiredColumns=[{5}])
>       HiveTableScan(table=[[default, part]], table:alias=[part])
>       HiveProject(p_name=[$1])
>         HiveFilter(condition=[AND(=($5, $cor0.p_size), =(+($cor0.p_size, 
> 121150), $0))])
>           HiveTableScan(table=[[default, part]], table:alias=[p])
> {noformat}
> The plan after applying the rule is invalid. The 
> {{HiveFilter(condition=[=($1, $12)])}} above the correlate references a column 
> ($12) from the right input, which is not part of the correlate's output 
> because the correlate is of type SEMI. Running the test with the 
> {{-Dcalcite.debug}} property enabled raises an {{AssertionError}} when 
> building the {{HiveFilter}}.
> The problem is currently hidden because a specific hack in 
> {{HiveRelDecorrelator}} turns this invalid plan into a valid one. This 
> mechanism is very brittle and can break easily, as happened while fixing 
> HIVE-24957.
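
To illustrate why the filter's reference is out of range, here is a minimal sketch in Python. It is a simplified, hypothetical model of join row types, not Calcite's or Hive's actual API: the function names are made up for illustration, and the left input width of 12 is an assumption (the issue does not state how many fields the scan exposes) chosen so that the first right-hand column lands at $12 as in the plan above.

```python
# Hypothetical model of how many fields a join/correlate exposes to its parent.
# None of these names come from Calcite or Hive.

LEFT_ONLY_JOIN_TYPES = {"semi", "anti"}


def output_field_count(join_type, left_fields, right_fields):
    """Return the number of output fields for the given join type."""
    if join_type in LEFT_ONLY_JOIN_TYPES:
        # SEMI/ANTI joins only emit the columns of the left input.
        return left_fields
    return left_fields + right_fields


def right_field_ref(left_fields, right_index):
    """Index at which a right-input column would appear above the join."""
    return left_fields + right_index


def is_valid_ref(ref, field_count):
    """An input reference $ref is valid only if it is inside the row type."""
    return 0 <= ref < field_count


LEFT_FIELDS = 12   # assumed width of the left input (not stated in the issue)
RIGHT_FIELDS = 1   # HiveProject(p_name=[$1]) emits a single column

ref = right_field_ref(LEFT_FIELDS, 0)  # first right-hand column
semi_ok = is_valid_ref(ref, output_field_count("semi", LEFT_FIELDS, RIGHT_FIELDS))
inner_ok = is_valid_ref(ref, output_field_count("inner", LEFT_FIELDS, RIGHT_FIELDS))
print(ref, semi_ok, inner_ok)
```

Under this model, a right-hand column always sits at an index of at least the left input's width, so it can never be inside the output of a SEMI correlate, whatever the actual width of the left input is.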



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
