Asif created SPARK-40362:
----------------------------

             Summary: Bug in Canonicalization of expressions like Add & 
Multiply i.e Commutative Operators
                 Key: SPARK-40362
                 URL: https://issues.apache.org/jira/browse/SPARK-40362
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 3.3.0
            Reporter: Asif


In the canonicalization code which is now in two stages, canonicalization 
involving Commutative operators is broken, if they are subexpressions of 
certain type of expressions which override precanonicalize.

Consider following expression:

a + b > 10

This GreaterThan expression when canonicalized as a whole for first time,  will 
skip the call to Canonicalize.reorderCommutativeOperators for the Add 
expression as the GreaterThan's canonicalization used precanonicalize on 
children ( the Add expression).

 

so if create a new expression 

b + a > 10 and invoke canonicalize it, the canonicalized versions of these two 
expressions will not match.

The test 
{quote}
test("bug X") {
    val tr1 = LocalRelation('c.int, 'b.string, 'a.int)

   val y = tr1.where('a.attr + 'c.attr > 10).analyze
   val fullCond = y.asInstanceOf[Filter].condition.clone()
  val addExpr = (fullCond match {
    case GreaterThan(x: Add, _) => x
    case LessThan(_, x: Add) => x
  }).clone().asInstanceOf[Add]
val canonicalizedFullCond = fullCond.canonicalized

// swap the operands of add
val newAddExpr = Add(addExpr.right, addExpr.left)

// build a new condition which is same as the previous one, but with operands 
of //Add reversed 

val builtCondnCanonicalized = GreaterThan(newAddExpr, Literal(10)).canonicalized

assertEquals(canonicalizedFullCond, builtCondnCanonicalized)
}
{quote}

This test fails.

The fix which I propose is that for the commutative expressions,  the 
precanonicalize should be overridden and   
Canonicalize.reorderCommutativeOperators be invoked on the expression instead 
of at place of canonicalize.  effectively for commutative operands ( add, or , 
multiply , and etc)  canonicalize and precanonicalize should be same.
I will file a PR for it



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

Reply via email to