[jira] [Commented] (HIVE-17342) Where condition with 1=0 should be treated similar to limit 0

Jacques (Jira) Wed, 17 Aug 2022 01:17:05 -0700


    [ 
https://issues.apache.org/jira/browse/HIVE-17342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17580663#comment-17580663
 ]


Jacques commented on HIVE-17342:
--------------------------------

I just want to add my support for this Jira. We have a scenario where a third 
party tool generates the "WHERE 0=1" syntax, so we have no control to change it 
to LIMIT 0. In our use-case, the datawarehouse design makes use of Hive views a 
lot as well, which makes the issue even worse (executing the query on a view 
that is often quite complex).

The expectation is that Hive should be able to optimize the execution away, 
similar to databases like SQL Server / etc, or like Hive itself is already 
doing with LIMIT 0.

The impact to our development team is actually quite severe, since these type 
of queries are executed in the background by the tool regularly in their dev 
process - leading to minutes long wait times.

> Where condition with 1=0 should be treated similar to limit 0
> -------------------------------------------------------------
>
>                 Key: HIVE-17342
>                 URL: https://issues.apache.org/jira/browse/HIVE-17342
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Rajesh Balamohan
>            Priority: Minor
>
> In some cases, queries may get executed with where condition mentioning to 
> "1=0" to get schema. E.g 
> {noformat}
> SELECT * FROM (select avg(d_year) as  y from date_dim where d_year>1999) q 
> WHERE 1=0
> {noformat}
> Currently hive executes the query; it would be good to consider this similar 
> to "limit 0" which does not execute the query.
> {code}
> hive> explain SELECT * FROM (select avg(d_year) as  y from date_dim where 
> d_year>1999) q WHERE 1=0;
> OK
> Plan optimized by CBO.
> Vertex dependency in root stage
> Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE)
> Stage-0
>   Fetch Operator
>     limit:-1
>     Stage-1
>       Reducer 2 vectorized, llap
>       File Output Operator [FS_13]
>         Group By Operator [GBY_12] (rows=1 width=76)
>           Output:["_col0"],aggregations:["avg(VALUE._col0)"]
>         <-Map 1 [CUSTOM_SIMPLE_EDGE] vectorized, llap
>           PARTITION_ONLY_SHUFFLE [RS_11]
>             Group By Operator [GBY_10] (rows=1 width=76)
>               Output:["_col0"],aggregations:["avg(d_year)"]
>               Filter Operator [FIL_9] (rows=1 width=0)
>                 predicate:false
>                 TableScan [TS_0] (rows=1 width=0)
>                   
> default@date_dim,date_dim,Tbl:PARTIAL,Col:NONE,Output:["d_year"]
> {code}
> It does generate 0 splits, but does send a DAG plan to the AM and receive 0 
> rows as output.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

[jira] [Commented] (HIVE-17342) Where condition with 1=0 should be treated similar to limit 0

Reply via email to