[ 
https://issues.apache.org/jira/browse/CALCITE-5646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713104#comment-17713104
 ] 

Stamatis Zampetakis commented on CALCITE-5646:
----------------------------------------------

I agree with [~lchistov1987] that the problem seems to be in the rule and not 
in the Strong class.

Conceptually I would say that the following holds:
{noformat}
COALESCE(t1.name, t2.name) is strong in {t1.name, t2.name} 
COALESCE(t1.name, t2.name) is not strong in {t1.name} 
COALESCE(t1.name, t2.name) is not strong in {t2.name} 
COALESCE(t1.name, t2.name) is not strong in {}
{noformat}
 The rule currently checks the first case (considers null all 
inputs/attributes) and given that the predicate is deemed strong under {t1,t2} 
it wrongly infers that: IS NOT NULL(t1.name) and IS NOT NULL(t2.name)

I think it makes sense to consider the inputs in isolation as [~lchistov1987] 
did in the PR.

Few other general comments about Strong class.

The definition of Strong/Null rejecting seems to defer a bit from the database 
literature. The following definition is taken from "On the Correct and Complete 
Enumeration of the Core Search Space":
{quote}Definition 1. A predicate is null rejecting for a set of attributes A if 
it evaluates to false or unknown on every tuple in which all attributes in A 
are null.{quote}
I think this definition coincides with what Strong#isNotTrue method is doing 
but not with Strong#isStrong and this caused a bit of confusion for me.

The current javadoc of Strong#isNotTrue contains some contradiction; 
"definitely not return true... equivalently, will definitely not return null or 
false". Suggested rephrase below:
{code:java}
/** Returns whether the expression evaluates to false or null when all 
specified columns are null. */
{code}

The current implementation of Strong#isStrong is not different from 
Strong#isNull yet they have a kind of different javadoc/definition. I think we 
would be better if we removed one of them.

Probably we can defer javadoc/refactoring/definition comments in other PRs but 
I though it would be good to discuss now since more than one people have the 
context around these classes on our mind.

> JoinDeriveIsNotNullFilterRule incorrectly handles COALESCE in join condition
> ----------------------------------------------------------------------------
>
>                 Key: CALCITE-5646
>                 URL: https://issues.apache.org/jira/browse/CALCITE-5646
>             Project: Calcite
>          Issue Type: Bug
>    Affects Versions: 1.34.0
>            Reporter: Leonid Chistov
>            Assignee: Leonid Chistov
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Consider query
> {code:java}
> select t1.deptno from empnullables t1 inner join
> empnullables t2 on coalesce(t1.ename, t2.ename) = 'abc' {code}
> When JoinDeriveIsNotNullFilterRule is applied to it, it is incorrectly 
> transformed to query plan
> {code:java}
> LogicalProject(DEPTNO=[$7])
>  LogicalJoin(condition=[=(CASE(IS NOT NULL($1), $1, $10), 'abc')], 
> joinType=[inner])
>    LogicalFilter(condition=[IS NOT NULL($1)])
>      LogicalTableScan(table=[[CATALOG, SALES, EMPNULLABLES]])
>    LogicalFilter(condition=[IS NOT NULL($1)])
>      LogicalTableScan(table=[[CATALOG, SALES, EMPNULLABLES]]) {code}
> It is not valid to deduce that join keys from the both sides cannot have null 
> values. All that we can deduce from the join condition, is that they cannot 
> be null in the same time.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to