[ https://issues.apache.org/jira/browse/CALCITE-5756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17736942#comment-17736942 ]
winds edited comment on CALCITE-5756 at 6/26/23 12:42 AM:
----------------------------------------------------------

Thanks for checking. I get your point; my understanding is as follows.

Firstly, for the following SQL, assume the constraints that Emp.deptno is a foreign key referencing the unique key Dept.deptno, and Emp.ename is a foreign key referencing the unique key Dept.deptno.
{code:java}
SELECT e.ename, e.deptno, Dept.name
FROM (SELECT Emp.ename, Emp.deptno FROM Emp) e
INNER JOIN Dept ON e.deptno = Dept.deptno
{code}
It seems that there are two ways to get the constraints on the current rel node.

a. RelMetadataQuery#getForeignKeys marks which positions are valid foreign keys (a position that falls on an aggregate function seems invalid). If we want to know which unique key a foreign key references, we can get the relation through RelMetadataQuery#getColumnOrigin, as in the logic of the method [ProjectJoinRemoveRule#areForeignKeysValid()|https://github.com/JingDas/calcite/blob/586dbf40d6ef7752b554c08fe573e600da456876/core/src/main/java/org/apache/calcite/rel/rules/ProjectJoinRemoveRule.java#L170].

b. If we want to get all the constraint information from RelMetadataQuery#getForeignKeys, it seems that it should return Set<Pair<ImmutableBitSet, ImmutableBitSet>>, which represents the foreign key positions and the unique key positions they reference on the current rel node. With the SQL and logic above, if we call RelMetadataQuery#getForeignKeys(Project(Emp.ename, Emp.deptno), true), it seems it would return [<[Emp.ename, Emp.deptno], null>], because the foreign keys reference another table, Dept, which does not appear in the current Project rel node.

To be precise, I have no more ideas on how to track and record this constraint relationship during the bottom-up derivation over rel nodes, so I took the first approach above.

Secondly, I searched some docs.
I found that some databases, such as MySQL and SQL Server, allow composite foreign keys. For the method RelMetadataQuery#getForeignKeys, returning Set<ImmutableBitSet> seems able to represent composite foreign keys, so I will use Set<ImmutableBitSet> to fix the composite foreign key representation.

WDYT?

> Expand ProjectJoinRemoveRule to support inner join removal by using the
> foreign-unique constraints
> --------------------------------------------------------------------------------------------------
>
> Key: CALCITE-5756
> URL: https://issues.apache.org/jira/browse/CALCITE-5756
> Project: Calcite
> Issue Type: Improvement
> Components: core
> Reporter: winds
> Assignee: winds
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.35.0
>
>
> Join elimination is a useful optimization.
> Consider a query that joins the two tables but does not make use of the Dept
> columns:
> {code:java}
> SELECT Emp.name, Emp.salary
> FROM Emp, Dept
> WHERE Emp.deptno = Dept.dno {code}
> Assuming Emp.deptno is the foreign key and is non-null, and Dept.dno is the
> unique key, the SQL above can be rewritten as follows, removing the Dept
> table without affecting the result set.
> {code:java}
> SELECT Emp.name, Emp.salary
> FROM Emp {code}
> Without redundant join elimination, this query may perform poorly.
> This optimization is also available in SQL Server, Oracle, Snowflake
> and so on.
> In Calcite, I think it is also useful. The infrastructure that join
> elimination depends on is already available.
> The main steps are as follows:
> 1. Analyse the columns used by the project, and then split them into left and right
> sides.
> 2. 
According to the project info above and the outer join type, bail out in some
> scenarios.
> 3. Get join info such as join keys.
> 4. For inner join, check foreign and unique keys; these may use
> RelMetadataQuery#getForeignKeys (newly added, similar to
> RelMetadataQuery#getUniqueKeys) and
> RelOptTable#getReferentialConstraints.
> 5. Check that the join keys on the side being removed are unique (areColumnsUnique), for both outer join and
> inner join.
> 6. If all checks pass, calculate the final project and transform.
> Please help me check whether this improvement is useful or not.
> I would like to add this improvement to Calcite.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
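
For illustration only, the inner-join part of the steps in the description (the project uses no column of the removed side, and the join keys match a non-null foreign key that references a unique key) could be sketched roughly as below. This is a simplified model under stated assumptions, not Calcite's API: the class JoinRemovalSketch, the ForeignKey type, and the method canRemoveReferencedSide are all hypothetical names, columns are plain integer positions, and the referenced columns are assumed to already be known to form a unique key.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical, simplified model of the inner-join removal check; not Calcite code. */
final class JoinRemovalSketch {

  /** A (possibly composite) foreign key on the kept side of the join. */
  static final class ForeignKey {
    final List<Integer> columns;            // positions on the referencing (kept) side
    final List<Integer> referencedColumns;  // positions on the referenced side; assumed to be a unique key
    final boolean nonNull;                  // true if no foreign key column can be NULL

    ForeignKey(List<Integer> columns, List<Integer> referencedColumns, boolean nonNull) {
      if (columns.size() != referencedColumns.size()) {
        throw new IllegalArgumentException("column lists must have the same length");
      }
      this.columns = columns;
      this.referencedColumns = referencedColumns;
      this.nonNull = nonNull;
    }
  }

  /** Encodes column pairs as an order-independent set such as {"1->0"}. */
  private static Set<String> pairs(List<Integer> left, List<Integer> right) {
    Set<String> result = new HashSet<>();
    for (int i = 0; i < left.size(); i++) {
      result.add(left.get(i) + "->" + right.get(i));
    }
    return result;
  }

  /**
   * Returns true if the referenced side of an inner equi-join can be dropped:
   * the project must not use its columns, and the join key pairs must be exactly
   * the column pairs of some non-null foreign key.
   */
  static boolean canRemoveReferencedSide(
      List<Integer> keptSideKeys,
      List<Integer> referencedSideKeys,
      boolean projectUsesReferencedSide,
      List<ForeignKey> foreignKeys) {
    if (projectUsesReferencedSide
        || keptSideKeys.isEmpty()
        || keptSideKeys.size() != referencedSideKeys.size()) {
      return false;
    }
    Set<String> joinPairs = pairs(keptSideKeys, referencedSideKeys);
    for (ForeignKey fk : foreignKeys) {
      if (fk.nonNull && pairs(fk.columns, fk.referencedColumns).equals(joinPairs)) {
        return true;
      }
    }
    return false;
  }
}
```

Requiring the join key pairs to equal the foreign key pairs exactly is a deliberately conservative choice in this sketch: a join on only part of a composite foreign key, or a join with an extra key pair outside the foreign key, is rejected.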