[ https://issues.apache.org/jira/browse/CALCITE-5646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713104#comment-17713104 ]
Stamatis Zampetakis commented on CALCITE-5646: ---------------------------------------------- I agree with [~lchistov1987] that the problem seems to be in the rule and not in the Strong class. Conceptually I would say that the following holds: {noformat} COALESCE(t1.name, t2.name) is strong in {t1.name, t2.name} COALESCE(t1.name, t2.name) is not strong in {t1.name} COALESCE(t1.name, t2.name) is not strong in {t2.name} COALESCE(t1.name, t2.name) is not strong in {} {noformat} The rule currently checks the first case (considers null all inputs/attributes) and given that the predicate is deemed strong under {t1,t2} it wrongly infers that: IS NOT NULL(t1.name) and IS NOT NULL(t2.name) I think it makes sense to consider the inputs in isolation as [~lchistov1987] did in the PR. Few other general comments about Strong class. The definition of Strong/Null rejecting seems to defer a bit from the database literature. The following definition is taken from "On the Correct and Complete Enumeration of the Core Search Space": {quote}Definition 1. A predicate is null rejecting for a set of attributes A if it evaluates to false or unknown on every tuple in which all attributes in A are null.{quote} I think this definition coincides with what Strong#isNotTrue method is doing but not with Strong#isStrong and this caused a bit of confusion for me. The current javadoc of Strong#isNotTrue contains some contradiction; "definitely not return true... equivalently, will definitely not return null or false". Suggested rephrase below: {code:java} /** Returns whether the expression evaluates to false or null when all specified columns are null. */ {code} The current implementation of Strong#isStrong is not different from Strong#isNull yet they have a kind of different javadoc/definition. I think we would be better if we removed one of them. Probably we can defer javadoc/refactoring/definition comments in other PRs but I though it would be good to discuss now since more than one people have the context around these classes on our mind. > JoinDeriveIsNotNullFilterRule incorrectly handles COALESCE in join condition > ---------------------------------------------------------------------------- > > Key: CALCITE-5646 > URL: https://issues.apache.org/jira/browse/CALCITE-5646 > Project: Calcite > Issue Type: Bug > Affects Versions: 1.34.0 > Reporter: Leonid Chistov > Assignee: Leonid Chistov > Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Consider query > {code:java} > select t1.deptno from empnullables t1 inner join > empnullables t2 on coalesce(t1.ename, t2.ename) = 'abc' {code} > When JoinDeriveIsNotNullFilterRule is applied to it, it is incorrectly > transformed to query plan > {code:java} > LogicalProject(DEPTNO=[$7]) > LogicalJoin(condition=[=(CASE(IS NOT NULL($1), $1, $10), 'abc')], > joinType=[inner]) > LogicalFilter(condition=[IS NOT NULL($1)]) > LogicalTableScan(table=[[CATALOG, SALES, EMPNULLABLES]]) > LogicalFilter(condition=[IS NOT NULL($1)]) > LogicalTableScan(table=[[CATALOG, SALES, EMPNULLABLES]]) {code} > It is not valid to deduce that join keys from the both sides cannot have null > values. All that we can deduce from the join condition, is that they cannot > be null in the same time. -- This message was sent by Atlassian Jira (v8.20.10#820010)