[ 
https://issues.apache.org/jira/browse/HIVE-21950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Murshid Chalaev updated HIVE-21950:
-----------------------------------
    Summary: CBO complicates execution plan of queries with SUBSTR function in 
EXISTS clause  (was: Optimizer complicates execution plan of queries with 
SUBSTR function in EXISTS clause)

> CBO complicates execution plan of queries with SUBSTR function in EXISTS 
> clause
> -------------------------------------------------------------------------------
>
>                 Key: HIVE-21950
>                 URL: https://issues.apache.org/jira/browse/HIVE-21950
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 2.3.0, 3.1.1
>            Reporter: Murshid Chalaev
>            Priority: Minor
>         Attachments: disabled_CBO.txt, enabled_CBO.txt
>
>
> Queries with SUBSTR function in EXISTS clause have much more complicated 
> execution plan in Hive-2.3 with enabled CBO then with disabled. A query below 
> has 8 stages which submit 4 MR jobs in Hive-2.3 when CBO is enabled, while 
> with disabled it has 4 stages and submits 1 MR job.
>  *STEPS TO REPRODUCE:*
> {code:java}
> CREATE TABLE i1122 (id STRING);
> INSERT INTO i1122 VALUES (1),(1001); 
> EXPLAIN
> SELECT *
> FROM i1122 AS t1
> WHERE EXISTS (
>     SELECT 1
>     FROM i1122 AS t2
>     WHERE t2.id = substr(t1.id,4)
> );{code}
> *ACTUAL RESULT:*
> Explain plan in Hive-2.3 with disabled CBO(The same execution plan was in 
> Hive-1.2 with enabled CBO):
> {code:java}
> hive> SET hive.cbo.enable=false;
> hive> EXPLAIN
>     > SELECT *
>     > FROM i1122 AS t1
>     > WHERE EXISTS (
>     >     SELECT 1
>     >     FROM i1122 AS t2
>     >     WHERE t2.id = substr(t1.id,4)
>     > );
> OK
> STAGE DEPENDENCIES:
>   Stage-4 is a root stage
>   Stage-3 depends on stages: Stage-4
>   Stage-0 depends on stages: Stage-3
> STAGE PLANS:
> ...
> {code}
> Explain plan in Hive-2.3 with enabled CBO:
> {code:java}
> hive> SET hive.cbo.enable=true;
> hive> EXPLAIN
>     > SELECT *
>     > FROM i1122 AS t1
>     > WHERE EXISTS (
>     >     SELECT 1
>     >     FROM i1122 AS t2
>     >     WHERE t2.id = substr(t1.id,4)
>     > );
> OK
> STAGE DEPENDENCIES:
>   Stage-3 is a root stage
>   Stage-8 depends on stages: Stage-3
>   Stage-6 depends on stages: Stage-8
>   Stage-5 depends on stages: Stage-6 , consists of Stage-7, Stage-1
>   Stage-7 has a backup stage: Stage-1
>   Stage-4 depends on stages: Stage-7
>   Stage-1
>   Stage-0 depends on stages: Stage-4, Stage-1
> STAGE PLANS:
> ...
> {code}
> Full explain plans you can find in attachment



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to