[jira] [Comment Edited] (CALCITE-5756) Expand ProjectJoinRemoveRule to support inner join removal by using the foreign-unique constraints
[ https://issues.apache.org/jira/browse/CALCITE-5756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17748376#comment-17748376 ] JingDas edited comment on CALCITE-5756 at 7/28/23 3:00 AM: --- This [PR|https://github.com/apache/calcite/pull/3264] contains two features, one is [RelMetadataQuery#getForeignKeys|https://github.com/JingDas/calcite/blob/f1c55fcde9f6a1c25529dd00ebc679e87f57a39f/core/src/main/java/org/apache/calcite/rel/metadata/RelMetadataQuery.java#L526] which is get foreign key metadata from current relNode the other is expanding [ProjectJoinRemoveRule|https://github.com/JingDas/calcite/blob/f1c55fcde9f6a1c25529dd00ebc679e87f57a39f/core/src/main/java/org/apache/calcite/rel/rules/ProjectJoinRemoveRule.java#L79] to support inner join which depends on feature one. I notice that the code in the PR is too much which mostly is test code, should I split the RP into two PRs one is for feature one and the other is for feature two, or maintain the current situation. If you hava any ideas, please let me know. Thanks. was (Author: JIRAUSER292370): This [PR|https://github.com/apache/calcite/pull/3264] contains two features, one is [RelMetadataQuery#getForeignKeys|https://github.com/JingDas/calcite/blob/f1c55fcde9f6a1c25529dd00ebc679e87f57a39f/core/src/main/java/org/apache/calcite/rel/metadata/RelMetadataQuery.java#L526] which is get foreign key metadata from current relNode the other is expanding ProjectJoinRemoveRule to support inner join which depends on feature one. I notice that the code in the PR is too much which mostly is test code, should I split the RP into two PRs one is for feature one and the other is for feature two, or maintain the current situation. If you hava any ideas, please let me know. Thanks. > Expand ProjectJoinRemoveRule to support inner join removal by using the > foreign-unique constraints > -- > > Key: CALCITE-5756 > URL: https://issues.apache.org/jira/browse/CALCITE-5756 > Project: Calcite > Issue Type: Improvement > Components: core >Reporter: JingDas >Assignee: JingDas >Priority: Major > Labels: pull-request-available > > Join elimination is a useful optmize improvement. > Consider a query that joins the two tables but does not make use of the Dept > columns: > {code:java} > SELECT Emp.name, Emp.salary > FROM Emp, Dept > WHERE Emp.deptno = Dept.dno {code} > Assuming Emp.deptno is the foreign-key and is non-null, Dept.dno is the > unique-key. The sql above can be rewritten as following. remove the Dept > table without affecting the resultset. > {code:java} > SELECT Emp.name, Emp.salary > FROM Emp {code} > Without redundant join elimination, this query execution may perform poorly. > The optimize improvement is also available in SQL Server, Oracle and > Snowflake and so on. > In Calcite, i think that is also useful. The infrastructure that join > elimination depend on is already available. > The main steps are as follows: > 1. Analyse the column used by project, and then split them to left and right > side. > 2. Acccording to the project info above and outer join type, bail out in some > scene. > 3. Get join info such as join keys. > 4. For inner join check foreign and unique keys, these may use > RelMetadataQuery#getForeignKeys(newly add, similar to > RelMetadataQuery#getUniqueKeys), > RelOptTable#getReferentialConstraints. > 5. Check removing side join keys are areColumnsUnique both for outer join and > inner join. > 6. If all done, calculate the fianl project and transform. > Please help me to check the improvement whether is useful or not. > And i would like to add this improvement to Calcite. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (CALCITE-5756) Expand ProjectJoinRemoveRule to support inner join removal by using the foreign-unique constraints
[ https://issues.apache.org/jira/browse/CALCITE-5756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17741583#comment-17741583 ] winds edited comment on CALCITE-5756 at 7/10/23 12:15 PM: -- Based on the previous discussion, the `RelOptForeignKey` component has been added to represent the composite or single foreign key and the unique key that foreign key references. was (Author: JIRAUSER292370): Based on the previous discussion, the `RelOptForeignKey` component has been added to represent the composite foreign keys and the unique constraint of foreign key references. > Expand ProjectJoinRemoveRule to support inner join removal by using the > foreign-unique constraints > -- > > Key: CALCITE-5756 > URL: https://issues.apache.org/jira/browse/CALCITE-5756 > Project: Calcite > Issue Type: Improvement > Components: core >Reporter: winds >Assignee: winds >Priority: Major > Labels: pull-request-available > > Join elimination is a useful optmize improvement. > Consider a query that joins the two tables but does not make use of the Dept > columns: > {code:java} > SELECT Emp.name, Emp.salary > FROM Emp, Dept > WHERE Emp.deptno = Dept.dno {code} > Assuming Emp.deptno is the foreign-key and is non-null, Dept.dno is the > unique-key. The sql above can be rewritten as following. remove the Dept > table without affecting the resultset. > {code:java} > SELECT Emp.name, Emp.salary > FROM Emp {code} > Without redundant join elimination, this query execution may perform poorly. > The optimize improvement is also available in SQL Server, Oracle and > Snowflake and so on. > In Calcite, i think that is also useful. The infrastructure that join > elimination depend on is already available. > The main steps are as follows: > 1. Analyse the column used by project, and then split them to left and right > side. > 2. Acccording to the project info above and outer join type, bail out in some > scene. > 3. Get join info such as join keys. > 4. For inner join check foreign and unique keys, these may use > RelMetadataQuery#getForeignKeys(newly add, similar to > RelMetadataQuery#getUniqueKeys), > RelOptTable#getReferentialConstraints. > 5. Check removing side join keys are areColumnsUnique both for outer join and > inner join. > 6. If all done, calculate the fianl project and transform. > Please help me to check the improvement whether is useful or not. > And i would like to add this improvement to Calcite. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (CALCITE-5756) Expand ProjectJoinRemoveRule to support inner join removal by using the foreign-unique constraints
[ https://issues.apache.org/jira/browse/CALCITE-5756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17736942#comment-17736942 ] winds edited comment on CALCITE-5756 at 6/28/23 11:17 PM: -- [~jhyde] Thanks for your check. I aggre with you. As you said, it is reasonable that foreignKeys returning should include a. what table each column references b. composite foreign keys. I have some ideas to make RelMetadataQuery#getForeignKeys to satisfy the above, I will try to fix it. was (Author: JIRAUSER292370): [~jhyde] Thanks for your check. I aggre with you. As you said, it is reasonable that foreignKeys returning should include a. what table each column references b. composite foreign keys. I have some ideas to make RelMetadataQuery#getForeignKeys to satisfy the above , I will try to fix it. > Expand ProjectJoinRemoveRule to support inner join removal by using the > foreign-unique constraints > -- > > Key: CALCITE-5756 > URL: https://issues.apache.org/jira/browse/CALCITE-5756 > Project: Calcite > Issue Type: Improvement > Components: core >Reporter: winds >Assignee: winds >Priority: Major > Labels: pull-request-available > > Join elimination is a useful optmize improvement. > Consider a query that joins the two tables but does not make use of the Dept > columns: > {code:java} > SELECT Emp.name, Emp.salary > FROM Emp, Dept > WHERE Emp.deptno = Dept.dno {code} > Assuming Emp.deptno is the foreign-key and is non-null, Dept.dno is the > unique-key. The sql above can be rewritten as following. remove the Dept > table without affecting the resultset. > {code:java} > SELECT Emp.name, Emp.salary > FROM Emp {code} > Without redundant join elimination, this query execution may perform poorly. > The optimize improvement is also available in SQL Server, Oracle and > Snowflake and so on. > In Calcite, i think that is also useful. The infrastructure that join > elimination depend on is already available. > The main steps are as follows: > 1. Analyse the column used by project, and then split them to left and right > side. > 2. Acccording to the project info above and outer join type, bail out in some > scene. > 3. Get join info such as join keys. > 4. For inner join check foreign and unique keys, these may use > RelMetadataQuery#getForeignKeys(newly add, similar to > RelMetadataQuery#getUniqueKeys), > RelOptTable#getReferentialConstraints. > 5. Check removing side join keys are areColumnsUnique both for outer join and > inner join. > 6. If all done, calculate the fianl project and transform. > Please help me to check the improvement whether is useful or not. > And i would like to add this improvement to Calcite. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (CALCITE-5756) Expand ProjectJoinRemoveRule to support inner join removal by using the foreign-unique constraints
[ https://issues.apache.org/jira/browse/CALCITE-5756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17736942#comment-17736942 ] winds edited comment on CALCITE-5756 at 6/28/23 11:17 PM: -- [~jhyde] Thanks for your check. I aggre with you. As you said, it is reasonable that foreignKeys returning should include a. what table each column references b. composite foreign keys. I have some ideas to make RelMetadataQuery#getForeignKeys to satisfy the above , I will try to fix it. was (Author: JIRAUSER292370): Thanks for your check. Get your point, my understanding is as follows: Firstly. For the following sql, assumption there are constraints that Emp.deptno foreign key reference Dept.deptno unique key, Emp.ename foreign key reference Dept.deptno unique key. {code:java} SELECT e.ename, e.deptno, DEPT.name FROM (SELECT Emp.ename, Emp.deptno FROM Emp) e INNER JOIN Dept ON e.deptno on Dept.deptno{code} It seems that There are two ways to get constraints on current rel node. a. RelMetadataQuery#getForeignKeys mark out which position is the valid foreign key(if the position is on aggregate function, it seems invalid), if we want to know which unqinue key the foreign key reference, we can get the relation by RelMetadataQuery#getColumnOrigin,such as the logic in method [ProjectJoinRemoveRule#areForeignKeysValid()|https://github.com/JingDas/calcite/blob/586dbf40d6ef7752b554c08fe573e600da456876/core/src/main/java/org/apache/calcite/rel/rules/ProjectJoinRemoveRule.java#L170]. b. If we want to get all constraint information from RelMetadataQuery#getForeignKeys, it seems that RelMetadataQuery#getForeignKeys should return Set>, which represent that foreign key positions reference unique key positions correspondingly on current relNode. As above sql and logic, if we call the method RelMetadataQuery#getForeignKeys(Project(Emp.ename, Emp.deptno), true), it seems return [<[Emp.ename, Emp.deptno], null>], because it reference another table Dept, but the Dept does not appear in the current project relnode. To be precise, I have no more ideas for that how to keep track and record this constraint relationship during the bottom-up transfer derivation process of rel node. So I took the first approach above. Secondly. I just searched some docs. And find that in some database such as Mysql, SqlServer and so on, Foreign Key are allowed to be composited.For the method RelMetadataQuery#getForeignKeys, it seems to return Set which can represent the information for composite foreign keys, I will use Set to fix the composite foreign keys representation. WDYT? > Expand ProjectJoinRemoveRule to support inner join removal by using the > foreign-unique constraints > -- > > Key: CALCITE-5756 > URL: https://issues.apache.org/jira/browse/CALCITE-5756 > Project: Calcite > Issue Type: Improvement > Components: core >Reporter: winds >Assignee: winds >Priority: Major > Labels: pull-request-available > > Join elimination is a useful optmize improvement. > Consider a query that joins the two tables but does not make use of the Dept > columns: > {code:java} > SELECT Emp.name, Emp.salary > FROM Emp, Dept > WHERE Emp.deptno = Dept.dno {code} > Assuming Emp.deptno is the foreign-key and is non-null, Dept.dno is the > unique-key. The sql above can be rewritten as following. remove the Dept > table without affecting the resultset. > {code:java} > SELECT Emp.name, Emp.salary > FROM Emp {code} > Without redundant join elimination, this query execution may perform poorly. > The optimize improvement is also available in SQL Server, Oracle and > Snowflake and so on. > In Calcite, i think that is also useful. The infrastructure that join > elimination depend on is already available. > The main steps are as follows: > 1. Analyse the column used by project, and then split them to left and right > side. > 2. Acccording to the project info above and outer join type, bail out in some > scene. > 3. Get join info such as join keys. > 4. For inner join check foreign and unique keys, these may use > RelMetadataQuery#getForeignKeys(newly add, similar to > RelMetadataQuery#getUniqueKeys), > RelOptTable#getReferentialConstraints. > 5. Check removing side join keys are areColumnsUnique both for outer join and > inner join. > 6. If all done, calculate the fianl project and transform. > Please help me to check the improvement whether is useful or not. > And i would like to add this improvement to Calcite. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (CALCITE-5756) Expand ProjectJoinRemoveRule to support inner join removal by using the foreign-unique constraints
[ https://issues.apache.org/jira/browse/CALCITE-5756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17736942#comment-17736942 ] winds edited comment on CALCITE-5756 at 6/26/23 12:42 AM: -- Thanks for your check. Get your point, my understanding is as follows: Firstly. For the following sql, assumption there are constraints that Emp.deptno foreign key reference Dept.deptno unique key, Emp.ename foreign key reference Dept.deptno unique key. {code:java} SELECT e.ename, e.deptno, DEPT.name FROM (SELECT Emp.ename, Emp.deptno FROM Emp) e INNER JOIN Dept ON e.deptno on Dept.deptno{code} It seems that There are two ways to get constraints on current rel node. a. RelMetadataQuery#getForeignKeys mark out which position is the valid foreign key(if the position is on aggregate function, it seems invalid), if we want to know which unqinue key the foreign key reference, we can get the relation by RelMetadataQuery#getColumnOrigin,such as the logic in method [ProjectJoinRemoveRule#areForeignKeysValid()|https://github.com/JingDas/calcite/blob/586dbf40d6ef7752b554c08fe573e600da456876/core/src/main/java/org/apache/calcite/rel/rules/ProjectJoinRemoveRule.java#L170]. b. If we want to get all constraint information from RelMetadataQuery#getForeignKeys, it seems that RelMetadataQuery#getForeignKeys should return Set>, which represent that foreign key positions reference unique key positions correspondingly on current relNode. As above sql and logic, if we call the method RelMetadataQuery#getForeignKeys(Project(Emp.ename, Emp.deptno), true), it seems return [<[Emp.ename, Emp.deptno], null>], because it reference another table Dept, but the Dept does not appear in the current project relnode. To be precise, I have no more ideas for that how to keep track and record this constraint relationship during the bottom-up transfer derivation process of rel node. So I took the first approach above. Secondly. I just searched some docs. And find that in some database such as Mysql, SqlServer and so on, Foreign Key are allowed to be composited.For the method RelMetadataQuery#getForeignKeys, it seems to return Set which can represent the information for composite foreign keys, I will use Set to fix the composite foreign keys representation. WDYT? was (Author: JIRAUSER292370): Thanks for your check. Get your point, my understanding is as follows: Firstly. For the following sql, assumption there are constraints that Emp.deptno foreign key reference Dept.deptno unique key, Emp.ename foreign key reference Dept.deptno unique key. {code:java} SELECT e.ename, e.deptno, DEPT.name FROM (SELECT Emp.ename, Emp.deptno FROM Emp) e INNER JOIN Dept ON e.deptno on Dept.deptno{code} It seems that There are two ways to get constraints on current rel node. a. RelMetadataQuery#getForeignKeys mark out which position is the valid foreign key(if the position is on aggregate function, it seems invalid), if we want to know which unqinue key the foreign key reference, we can get the relation by RelMetadataQuery#getColumnOrigin,such as the logic in method [ProjectJoinRemoveRule#areForeignKeysValid()|https://github.com/JingDas/calcite/blob/586dbf40d6ef7752b554c08fe573e600da456876/core/src/main/java/org/apache/calcite/rel/rules/ProjectJoinRemoveRule.java#L170]. b. If we want to get all constraint information from RelMetadataQuery#getForeignKeys, it seems that RelMetadataQuery#getForeignKeys should return Set>, which represent that foreign key positions reference unique key positions correspondingly on current relNode. As above sql and logic, if we call the method RelMetadataQuery#getForeignKeys(Project(Emp.ename, Emp.deptno), true), it seems return [<[Emp.ename, Emp.deptno], null>], because it reference another table Dept, but the Dept does not appear in the current project relnode. To be precise, I don't have ideas for that how to keep track and record this constraint relationship during the bottom-up transfer derivation process of rel node. So I took the first approach above. Secondly. I just searched some docs. And find that in some database such as Mysql, SqlServer and so on, Foreign Key are allowed to be composited.For the method RelMetadataQuery#getForeignKeys, it seems to return Set which can represent the information for composite foreign keys, I will use Set to fix the composite foreign keys representation. WDYT? > Expand ProjectJoinRemoveRule to support inner join removal by using the > foreign-unique constraints > -- > > Key: CALCITE-5756 > URL: https://issues.apache.org/jira/browse/CALCITE-5756 > Project: Calcite > Issue Type: Improvement > Components: core >Reporter: winds >Assignee: winds >Priority: Majo
[jira] [Comment Edited] (CALCITE-5756) Expand ProjectJoinRemoveRule to support inner join removal by using the foreign-unique constraints
[ https://issues.apache.org/jira/browse/CALCITE-5756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17736942#comment-17736942 ] winds edited comment on CALCITE-5756 at 6/26/23 12:29 AM: -- Thanks for your check. Get your point, my understanding is as follows: Firstly. For the following sql, assumption there are constraints that Emp.deptno foreign key reference Dept.deptno unique key, Emp.ename foreign key reference Dept.deptno unique key. {code:java} SELECT e.ename, e.deptno, DEPT.name FROM (SELECT Emp.ename, Emp.deptno FROM Emp) e INNER JOIN Dept ON e.deptno on Dept.deptno{code} It seems that There are two ways to get constraints on current rel node. a. RelMetadataQuery#getForeignKeys mark out which position is the valid foreign key(if the position is on aggregate function, it seems invalid), if we want to know which unqinue key the foreign key reference, we can get the relation by RelMetadataQuery#getColumnOrigin,such as the logic in method [ProjectJoinRemoveRule#areForeignKeysValid()|https://github.com/JingDas/calcite/blob/586dbf40d6ef7752b554c08fe573e600da456876/core/src/main/java/org/apache/calcite/rel/rules/ProjectJoinRemoveRule.java#L170]. b. If we want to get all constraint information from RelMetadataQuery#getForeignKeys, it seems that RelMetadataQuery#getForeignKeys should return Set>, which represent that foreign key positions reference unique key positions correspondingly on current relNode. As above sql and logic, if we call the method RelMetadataQuery#getForeignKeys(Project(Emp.ename, Emp.deptno), true), it seems return [<[Emp.ename, Emp.deptno], null>], because it reference another table Dept, but the Dept does not appear in the current project relnode. To be precise, I don't have ideas for that how to keep track and record this constraint relationship during the bottom-up transfer derivation process of rel node. So I took the first approach above. Secondly. I just searched some docs. And find that in some database such as Mysql, SqlServer and so on, Foreign Key are allowed to be composited.For the method RelMetadataQuery#getForeignKeys, it seems to return Set which can represent the information for composite foreign keys, I will use Set to fix the composite foreign keys representation. WDYT? was (Author: JIRAUSER292370): Thanks for your check. Get your point, my understanding is as follows: Firstly. For the following sql, assumption there are constraints that Emp.deptno foreign key reference Dept.deptno unique key, Emp.ename foreign key reference Dept.deptno unique key. {code:java} SELECT e.ename, e.deptno, DEPT.name FROM (SELEECT Emp.ename, Emp.deptno FROM Emp) e INNER JOIN Dept ON e.deptno on Dept.deptno{code} It seems that There are two ways to get constraints on current rel node. a. RelMetadataQuery#getForeignKeys mark out which position is the valid foreign key(if the position is on aggregate function, it seems invalid), if we want to know which unqinue key the foreign key reference, we can get the relation by RelMetadataQuery#getColumnOrigin,such as the logic in method [ProjectJoinRemoveRule#areForeignKeysValid()|https://github.com/JingDas/calcite/blob/586dbf40d6ef7752b554c08fe573e600da456876/core/src/main/java/org/apache/calcite/rel/rules/ProjectJoinRemoveRule.java#L170]. b. If we want to get all constraint information from RelMetadataQuery#getForeignKeys, it seems that RelMetadataQuery#getForeignKeys should return Set>, which represent that foreign key positions reference unique key positions correspondingly on current relNode. As above sql and logic, if we call the method RelMetadataQuery#getForeignKeys(Project(Emp.ename, Emp.deptno), true), it seems return [<[Emp.ename, Emp.deptno], null>], because it reference another table Dept, but the Dept does not appear in the current project relnode. To be precise, I don't have ideas for that how to keep track and record this constraint relationship during the bottom-up transfer derivation process of rel node. So I took the first approach above. Secondly. I just searched some docs. And find that in some database such as Mysql, SqlServer and so on, Foreign Key are allowed to be composited.For the method RelMetadataQuery#getForeignKeys, it seems to return Set which can represent the information for composite foreign keys, I will use Set to fix the composite foreign keys representation. WDYT? > Expand ProjectJoinRemoveRule to support inner join removal by using the > foreign-unique constraints > -- > > Key: CALCITE-5756 > URL: https://issues.apache.org/jira/browse/CALCITE-5756 > Project: Calcite > Issue Type: Improvement > Components: core >Reporter: winds >Assignee: winds >Priority: Majo