Claude Brisson created CALCITE-7199:
---------------------------------------

             Summary: RelMdColumnUniqueness joins handler is invalid for 
columns in both sides
                 Key: CALCITE-7199
                 URL: https://issues.apache.org/jira/browse/CALCITE-7199
             Project: Calcite
          Issue Type: Bug
          Components: core
    Affects Versions: 1.40.0
            Reporter: Claude Brisson


The {{RelMdColumnUniqueness.areColumnsUnique()}} handler for joins contains 
the following code:

{code}
    // If the original column mask contains columns from both the left and
    // right hand side, then the columns are unique if and only if they're
    // unique for their respective join inputs
    Boolean leftUnique = mq.areColumnsUnique(left, leftColumns, ignoreNulls);
    Boolean rightUnique = mq.areColumnsUnique(right, rightColumns, ignoreNulls);
    if ((leftColumns.cardinality() > 0)
        && (rightColumns.cardinality() > 0)) {
      if ((leftUnique == null) || (rightUnique == null)) {
        return null;
      } else {
        return leftUnique && rightUnique;
      }
    }
{code}

This is not correct. Uniqueness on both sides is a sufficient condition for the 
columns to be unique in the join result, but not a necessary one.
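
A small counterexample may help make this concrete. The sketch below does not use Calcite at all: it models rows as plain {{Object[]}} arrays and checks uniqueness by brute force. The {{EMP}} and {{DEPT}} tables, their contents, and the helper names are hypothetical and chosen only for illustration. The queried columns are {{empno}} (from the left) and the department name (from the right); the name is not unique in {{DEPT}}, yet the pair is unique in the join result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class JoinUniquenessCounterexample {
  /** Returns true if no two rows agree on all of the given column indexes. */
  static boolean areColumnsUnique(List<Object[]> rows, int... columns) {
    Set<List<Object>> seen = new HashSet<>();
    for (Object[] row : rows) {
      List<Object> key = new ArrayList<>();
      for (int c : columns) {
        key.add(row[c]);
      }
      if (!seen.add(key)) {
        return false; // a second row with the same key values
      }
    }
    return true;
  }

  /** Inner equi-join; each output row is the left row followed by the right row. */
  static List<Object[]> innerJoin(List<Object[]> left, int leftKey,
      List<Object[]> right, int rightKey) {
    List<Object[]> out = new ArrayList<>();
    for (Object[] l : left) {
      for (Object[] r : right) {
        if (l[leftKey].equals(r[rightKey])) {
          Object[] joined = Arrays.copyOf(l, l.length + r.length);
          System.arraycopy(r, 0, joined, l.length, r.length);
          out.add(joined);
        }
      }
    }
    return out;
  }

  // Hypothetical data: EMP(empno, deptno), empno is unique.
  static final List<Object[]> EMP = Arrays.asList(
      new Object[] {1, 10},
      new Object[] {2, 10},
      new Object[] {3, 20});

  // Hypothetical data: DEPT(deptno, name), deptno is unique, name is not.
  static final List<Object[]> DEPT = Arrays.asList(
      new Object[] {10, "Sales"},
      new Object[] {20, "Sales"});

  public static void main(String[] args) {
    // EMP JOIN DEPT ON EMP.deptno = DEPT.deptno
    // Output columns: 0 = empno, 1 = EMP.deptno, 2 = DEPT.deptno, 3 = name
    List<Object[]> join = innerJoin(EMP, 1, DEPT, 0);

    boolean leftUnique = areColumnsUnique(EMP, 0);       // true
    boolean rightUnique = areColumnsUnique(DEPT, 1);     // false
    boolean joinUnique = areColumnsUnique(join, 0, 3);   // true

    System.out.println("leftUnique && rightUnique = " + (leftUnique && rightUnique));
    System.out.println("unique in join result     = " + joinUnique);
  }
}
```

Here {{leftUnique && rightUnique}} evaluates to false while the queried columns are in fact unique in the join result, which is the situation the handler currently misreports.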

The columns will also be unique in the join result if the following conditions 
are all met:

* the queried columns coming from the left input are unique
* the right-side columns used in the join equi-condition are unique
* the join does not generate nulls on the left

(and the same with left and right reversed).

Because the join is done on a unique key on the right side, each left row 
matches at most one right row, so the uniqueness of the queried columns on the 
left is preserved in the join result. Whatever columns we then add to the 
queried subset, it remains unique.
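
The conditions listed above could be combined with the existing check roughly as follows. This is only a sketch under stated assumptions, not a proposed patch: the class, the method {{knownUnique}}, and all of its parameters are hypothetical, and the caller is assumed to have already computed the uniqueness of the queried columns and of the equi-join key columns on each side (with {{null}} meaning unknown, as in the quoted code). The sketch only answers whether one of the sufficient conditions is known to hold; it deliberately does not decide what the handler should return otherwise.

```java
public class JoinUniquenessSketch {
  /** True only when the value is known to be true; null means unknown. */
  static boolean isTrue(Boolean b) {
    return Boolean.TRUE.equals(b);
  }

  /**
   * Returns true if one of the sufficient conditions for uniqueness of the
   * queried columns in the join result is known to hold.
   *
   * All parameters are hypothetical inputs for this sketch:
   * leftUnique / rightUnique: uniqueness of the queried columns per input;
   * leftKeysUnique / rightKeysUnique: uniqueness of the equi-join key
   * columns per input;
   * nullsOnLeft / nullsOnRight: whether the join type can generate nulls
   * on that side.
   */
  static boolean knownUnique(Boolean leftUnique, Boolean rightUnique,
      Boolean leftKeysUnique, Boolean rightKeysUnique,
      boolean nullsOnLeft, boolean nullsOnRight) {
    // Condition already covered by the current handler.
    if (isTrue(leftUnique) && isTrue(rightUnique)) {
      return true;
    }
    // Left queried columns unique, join key unique on the right,
    // and no nulls generated on the left.
    if (isTrue(leftUnique) && isTrue(rightKeysUnique) && !nullsOnLeft) {
      return true;
    }
    // Same condition with left and right reversed.
    return isTrue(rightUnique) && isTrue(leftKeysUnique) && !nullsOnRight;
  }

  public static void main(String[] args) {
    // Right queried columns are not unique, but the right join key is.
    System.out.println(knownUnique(true, false, null, true, false, false));
  }
}
```

Taking the null-generation flags as plain booleans keeps the sketch independent of any particular join-type representation.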
