[ 
https://issues.apache.org/jira/browse/CALCITE-5756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17736942#comment-17736942
 ] 

winds edited comment on CALCITE-5756 at 6/26/23 12:42 AM:
----------------------------------------------------------

Thanks for checking. I get your point; my understanding is as follows:

Firstly.
For the following SQL, assume these constraints: Emp.deptno is a foreign 
key that references the unique key Dept.deptno, and 
Emp.ename is a foreign key that references the unique key Dept.deptno.
{code:java}
SELECT e.ename, e.deptno, DEPT.name
FROM
   (SELECT Emp.ename, Emp.deptno FROM Emp) e
INNER JOIN
Dept
ON e.deptno = Dept.deptno{code}
There seem to be two ways to get the constraints on the current rel node.

a. 
RelMetadataQuery#getForeignKeys marks which positions are valid foreign 
keys (a position produced by an aggregate function, for example, seems invalid). 
To find out which unique key a foreign key references, we can look up the 
relationship through RelMetadataQuery#getColumnOrigin, as in the method 
[ProjectJoinRemoveRule#areForeignKeysValid()|https://github.com/JingDas/calcite/blob/586dbf40d6ef7752b554c08fe573e600da456876/core/src/main/java/org/apache/calcite/rel/rules/ProjectJoinRemoveRule.java#L170].
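
To make approach (a) concrete, here is a minimal stand-alone sketch. It does not use Calcite classes: java.util.BitSet stands in for ImmutableBitSet, column origins are plain "Table.column" strings, and the class and method names (ForeignKeyOriginSketch, resolveReferences) are hypothetical. It only illustrates the idea of combining foreign key positions with column origins to find the referenced unique key.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Stand-alone sketch of approach (a); none of these types are Calcite classes. */
final class ForeignKeyOriginSketch {
  /**
   * For every output position flagged as a foreign key, finds the origin column
   * and then the unique-key column that the origin references.
   */
  static Map<Integer, String> resolveReferences(
      BitSet foreignKeyPositions,
      List<String> originByPosition,
      Map<String, String> referentialConstraints) {
    Map<Integer, String> referenced = new LinkedHashMap<>();
    for (int i = foreignKeyPositions.nextSetBit(0); i >= 0;
        i = foreignKeyPositions.nextSetBit(i + 1)) {
      String target = referentialConstraints.get(originByPosition.get(i));
      if (target != null) {
        referenced.put(i, target);
      }
    }
    return referenced;
  }

  public static void main(String[] args) {
    // Output of Project(Emp.ename, Emp.deptno): position 0 is ename, 1 is deptno.
    List<String> origins = Arrays.asList("Emp.ename", "Emp.deptno");
    // The constraints assumed in the example above.
    Map<String, String> constraints = new HashMap<>();
    constraints.put("Emp.deptno", "Dept.deptno");
    constraints.put("Emp.ename", "Dept.deptno");
    BitSet foreignKeys = new BitSet();
    foreignKeys.set(0);
    foreignKeys.set(1);
    System.out.println(resolveReferences(foreignKeys, origins, constraints));
  }
}
```

The point of the sketch is that the metadata only needs to carry positions; the referenced key is recovered on demand from the origin of each position.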

b. 
If we want to get all of the constraint information from 
RelMetadataQuery#getForeignKeys, it seems that 
the method should return Set<Pair<ImmutableBitSet, 
ImmutableBitSet>>, where each pair maps 
foreign key positions to the corresponding unique key positions on the current 
rel node.

With the SQL and logic above, if we call 
RelMetadataQuery#getForeignKeys(Project(Emp.ename, Emp.deptno), true), 
it seems it would return [<[Emp.ename, Emp.deptno], null>], because the foreign keys reference another 
table, Dept, and Dept does not appear 
in the current Project rel node.
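
The problem with approach (b) can be sketched as follows. This is an illustration only: KeyPair, bits and unresolved are hypothetical names, java.util.BitSet stands in for ImmutableBitSet, and the column order of the join output is an assumption. On the Project the referenced side cannot be expressed as positions, so it is null; on the Join it can.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Stand-alone sketch of approach (b); none of these types are Calcite classes. */
final class ForeignKeyPairSketch {
  /** Hypothetical pair of foreign key positions and referenced unique key positions. */
  static final class KeyPair {
    final BitSet foreignKey;
    // null when the referenced columns are not part of this rel node's output
    final BitSet uniqueKey;

    KeyPair(BitSet foreignKey, BitSet uniqueKey) {
      this.foreignKey = foreignKey;
      this.uniqueKey = uniqueKey;
    }

    boolean isResolved() {
      return uniqueKey != null;
    }

    @Override public String toString() {
      return "<" + foreignKey + ", " + uniqueKey + ">";
    }
  }

  static BitSet bits(int... positions) {
    BitSet set = new BitSet();
    for (int p : positions) {
      set.set(p);
    }
    return set;
  }

  /** Returns the pairs whose referenced side is unknown on the current rel node. */
  static List<KeyPair> unresolved(List<KeyPair> pairs) {
    List<KeyPair> result = new ArrayList<>();
    for (KeyPair pair : pairs) {
      if (!pair.isResolved()) {
        result.add(pair);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    // Project(Emp.ename, Emp.deptno): both positions are foreign keys, Dept is absent.
    KeyPair onProject = new KeyPair(bits(0, 1), null);
    // Join output assumed to be [ename, deptno, Dept.deptno, Dept.name]:
    // deptno (1) references Dept.deptno (2).
    KeyPair onJoin = new KeyPair(bits(1), bits(2));
    List<KeyPair> pairs = new ArrayList<>();
    pairs.add(onProject);
    pairs.add(onJoin);
    System.out.println("all pairs:  " + pairs);
    System.out.println("unresolved: " + unresolved(pairs));
  }
}
```

Filling in the null side later would require remembering which table and columns each pair refers to while the metadata is derived bottom-up, which is the bookkeeping described as unsolved below.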

To be precise, I have no good idea of how to track and record this 
constraint relationship 
during the bottom-up derivation over the rel nodes, so I took the 
first approach above.

Secondly.
I searched some docs and found that some databases, such as MySQL and 
SQL Server, allow foreign keys to be 
composite. It seems that RelMetadataQuery#getForeignKeys should return 
Set<ImmutableBitSet>, which can represent 
composite foreign keys, so I will use Set<ImmutableBitSet> to fix the 
composite foreign key representation.
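
As a rough illustration of the Set<ImmutableBitSet> shape, the sketch below uses java.util.BitSet as a stand-in and hypothetical names (CompositeForeignKeySketch, coversSomeForeignKey). Each bit set is one foreign key, and a composite key simply has more than one bit set. The column positions are made up for the example.

```java
import java.util.BitSet;
import java.util.HashSet;
import java.util.Set;

/** Stand-alone sketch of composite foreign keys; not Calcite code. */
final class CompositeForeignKeySketch {
  static BitSet bits(int... positions) {
    BitSet set = new BitSet();
    for (int p : positions) {
      set.set(p);
    }
    return set;
  }

  /** True if the join keys cover every column of at least one foreign key. */
  static boolean coversSomeForeignKey(Set<BitSet> foreignKeys, BitSet joinKeys) {
    for (BitSet foreignKey : foreignKeys) {
      if (foreignKey.isEmpty()) {
        continue; // an empty key carries no constraint
      }
      BitSet missing = (BitSet) foreignKey.clone();
      missing.andNot(joinKeys); // columns of the key that the join does not use
      if (missing.isEmpty()) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    // Hypothetical table: a composite foreign key on columns (0, 1)
    // and a single-column foreign key on column 2.
    Set<BitSet> foreignKeys = new HashSet<>();
    foreignKeys.add(bits(0, 1));
    foreignKeys.add(bits(2));

    System.out.println(coversSomeForeignKey(foreignKeys, bits(0)));    // half of the composite key
    System.out.println(coversSomeForeignKey(foreignKeys, bits(0, 1))); // the whole composite key
  }
}
```

A join on only part of a composite key does not match the constraint, which is why a per-column flag is not enough and a set of bit sets is needed.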

WDYT?


> Expand ProjectJoinRemoveRule to support inner join removal by using the 
> foreign-unique constraints
> --------------------------------------------------------------------------------------------------
>
>                 Key: CALCITE-5756
>                 URL: https://issues.apache.org/jira/browse/CALCITE-5756
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>            Reporter: winds
>            Assignee: winds
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.35.0
>
>
> Join elimination is a useful optimization. 
> Consider a query that joins the two tables but does not make use of the Dept 
> columns:
> {code:java}
> SELECT Emp.name, Emp.salary
> FROM Emp, Dept
> WHERE Emp.deptno = Dept.dno {code}
> Assume that Emp.deptno is a non-null foreign key and that Dept.dno is the 
> unique key it references. The SQL above can then be rewritten as follows, removing the Dept 
> table without affecting the result set.
> {code:java}
> SELECT Emp.name, Emp.salary
> FROM Emp {code}
> Without redundant join elimination, this query may perform poorly.
> This optimization is also available in SQL Server, Oracle, 
> Snowflake and others.
> I think it would also be useful in Calcite. The infrastructure that join 
> elimination depends on is already available.
> The main steps are as follows:
> 1. Analyze the columns used by the project, and split them into left and right 
> sides.
> 2. According to the project info above and the outer join type, bail out in some 
> cases.
> 3. Get join info such as the join keys.
> 4. For an inner join, check the foreign and unique keys; this may use
> RelMetadataQuery#getForeignKeys (newly added, similar to 
> RelMetadataQuery#getUniqueKeys) and
> RelOptTable#getReferentialConstraints.
> 5. Check that the join keys on the removed side are unique (areColumnsUnique), for both outer join and 
> inner join.
> 6. If all checks pass, compute the final project and transform. 
> Please help me check whether this improvement is useful.
> I would like to add this improvement to Calcite.
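
The inner-join checks in the steps above can be sketched roughly as follows. This is not the rule's implementation: the names (InnerJoinRemovalSketch, canRemoveRightSide) are hypothetical, java.util.BitSet stands in for ImmutableBitSet, and the caller is assumed to pass only non-null foreign keys that reference the right input's join keys.

```java
import java.util.BitSet;
import java.util.HashSet;
import java.util.Set;

/** Stand-alone sketch of the inner-join removal checks; not Calcite code. */
final class InnerJoinRemovalSketch {
  static BitSet bits(int... positions) {
    BitSet set = new BitSet();
    for (int p : positions) {
      set.set(p);
    }
    return set;
  }

  /**
   * Decides whether the right input of an inner join can be dropped.
   *
   * @param projectedColumns columns used by the project, as positions in the join output
   * @param leftFieldCount number of columns contributed by the left input
   * @param leftJoinKeys join key positions on the left input
   * @param leftForeignKeys non-null foreign keys of the left input that reference
   *     the right input's join keys (assumption of this sketch)
   * @param rightJoinKeysUnique whether the right join keys are unique
   */
  static boolean canRemoveRightSide(
      BitSet projectedColumns,
      int leftFieldCount,
      BitSet leftJoinKeys,
      Set<BitSet> leftForeignKeys,
      boolean rightJoinKeysUnique) {
    // Steps 1 and 2: bail out if the project uses any column of the right input.
    if (projectedColumns.nextSetBit(leftFieldCount) >= 0) {
      return false;
    }
    // Step 4: the left join keys must form a foreign key.
    if (!leftForeignKeys.contains(leftJoinKeys)) {
      return false;
    }
    // Step 5: the join keys on the removed side must be unique.
    return rightJoinKeysUnique;
  }

  public static void main(String[] args) {
    // Assumed layout: Emp = [name, salary, deptno] at 0..2, Dept = [dno] at 3.
    Set<BitSet> foreignKeys = new HashSet<>();
    foreignKeys.add(bits(2)); // Emp.deptno references Dept.dno
    // SELECT Emp.name, Emp.salary ... WHERE Emp.deptno = Dept.dno
    System.out.println(canRemoveRightSide(bits(0, 1), 3, bits(2), foreignKeys, true));
  }
}
```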



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
