[ 
https://issues.apache.org/jira/browse/CALCITE-7636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alessandro Solimando updated CALCITE-7636:
------------------------------------------
    Summary: Materialized view union rewriting drops rows where the view filter 
is not null-rejecting  (was: Materialized view union rewriting drops rows where 
the view filter is UNKNOWN)

> Materialized view union rewriting drops rows where the view filter is not 
> null-rejecting
> ----------------------------------------------------------------------------------------
>
>                 Key: CALCITE-7636
>                 URL: https://issues.apache.org/jira/browse/CALCITE-7636
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.42.0
>            Reporter: Alessandro Solimando
>            Assignee: Alessandro Solimando
>            Priority: Major
>             Fix For: 1.43.0
>
>
> When a query is wider than a materialized view, _MaterializedViewRule_ can 
> rewrite it as a union of the view plus a residual that reads the remaining 
> rows from the base table. Both _MaterializedViewAggregateRule_ and 
> _MaterializedViewJoinRule_ built the residual as _NOT(viewFilter)_.
> If _viewFilter_ is not null-rejecting (_i.e._, it can return _UNKNOWN_), then 
> the view branch keeps the rows evaluating to _TRUE_, the residual keeps only 
> the _FALSE_ rows, and the rows evaluating to _UNKNOWN_ are from both branches 
> and silently lost.
> Proposed fix: build the residual as _viewFilter IS NOT TRUE_, which is 
> two-valued, so the view branch covers _TRUE_, the residual cover 
> _FALSE_/_UNKNOWN_, partitioning the input exactly (no row is lost).
> For a minimal reproducer mimicking the MV rewrite with residuals via unions, 
> consider the following example:
>  
> {noformat}
> WITH emp(salary, commission) AS (
>   VALUES (10000, 1000), (10000, 250), (7000, NULL)
> )
> SELECT
>   (SELECT SUM(salary) FROM emp) AS direct,
>   (SELECT SUM(salary) FROM emp WHERE commission > 400)
>    +
>   (SELECT COALESCE(SUM(salary),0) 
>    FROM emp
>    WHERE NOT (commission > 400)) AS buggy_not,
>   (SELECT SUM(salary) FROM emp WHERE commission > 400)
>    +
>   (SELECT COALESCE(SUM(salary),0)
>    FROM emp
>    WHERE (commission > 400) IS NOT TRUE) AS fixed_isnottrue;{noformat}
> This evaluates to the following result, clearly showing that some rows are 
> lost in the current version, while the proposed fix returns the same result 
> of the original query:
> {noformat}
> | direct | buggy_not | fixed_isnottrue |
> | ------ | --------- | --------------- |
> | 27000  | 20000     | 27000           |{noformat}
> (repro executable for Postgres 18 
> [here|https://www.db-fiddle.com/#&togetherjs=VKfpwV7zUN] with df-fiddle).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to