[ https://issues.apache.org/jira/browse/HAWQ-830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ruilong Huo reassigned HAWQ-830: -------------------------------- Assignee: Ruilong Huo (was: Amr El-Helw) > Wrong result in CTE query due to CTE is treated as init plan by planner and > evaluated multiple times > ---------------------------------------------------------------------------------------------------- > > Key: HAWQ-830 > URL: https://issues.apache.org/jira/browse/HAWQ-830 > Project: Apache HAWQ > Issue Type: Bug > Components: Optimizer > Reporter: Ruilong Huo > Assignee: Ruilong Huo > > In CTE query, if the CTE itself is referenced multiple times, it should be > evaluated only once and then be used multiple time. However, it is treated as > init plan and evaluated multiple times in hawq 1.x and 2.0. This has two > issues here: > 1. If the query in CTE is "volatile" (i.e., select volatile function) or has > side effect (create/drop object in database), it may generate wrong result > 2. The performance of the query is not so efficient since the query in CTE is > evaluated multiple times. > Here is the steps to reproduce: > 1) in hawq, CTE is treated as init plan and evaluated 2 times. Thus, the > result is incorrect > {noformat} > WITH r AS (SELECT random()) > SELECT r1.*, r2.* > FROM r AS r1, r AS r2; > random | random > -------------------+------------------- > 0.519145511090755 | 0.751198637764901 > (1 row) > EXPLAIN > WITH r AS (SELECT random()) > SELECT r1.*, r2.* > FROM r AS r1, r AS r2; > QUERY PLAN > -------------------------------------------------------------- > Nested Loop (cost=0.04..0.77 rows=20 width=16) > -> Result (cost=0.01..0.02 rows=1 width=0) > InitPlan > -> Result (cost=0.00..0.01 rows=1 width=0) > -> Materialize (cost=0.03..0.09 rows=6 width=8) > -> Result (cost=0.01..0.02 rows=1 width=0) > InitPlan > -> Result (cost=0.00..0.01 rows=1 width=0) > Settings: default_hash_table_bucket_number=6 > Optimizer status: legacy query optimizer > (10 rows) > {noformat} > 2) in postgres, CTE is treated as CTE scan and evaluated 1 time. Thus, the > result is i > {noformat} > WITH r AS (SELECT random()) > SELECT r1.*, r2.* > FROM r AS r1, r AS r2; > random | random > -------------------+------------------- > 0.989214501809329 | 0.989214501809329 > (1 row) > EXPLAIN > WITH r AS (SELECT random()) > SELECT r1.*, r2.* > FROM r AS r1, r AS r2; > QUERY PLAN > ---------------------------------------------------------- > Nested Loop (cost=0.01..0.06 rows=1 width=16) > CTE r > -> Result (cost=0.00..0.01 rows=1 width=0) > -> CTE Scan on r r1 (cost=0.00..0.02 rows=1 width=8) > -> CTE Scan on r r2 (cost=0.00..0.02 rows=1 width=8) > (5 rows){noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)