[ 
https://issues.apache.org/jira/browse/SPARK-40362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17601066#comment-17601066
 ] 

Asif edited comment on SPARK-40362 at 9/7/22 12:42 AM:
-------------------------------------------------------

I have added locally test for 

&&, ||, &, | and *  and they all fail for the same reason..

in process of adding tests for xor, greatest & least and testing the fix for 
the same...

 

the fix is to make these expressions implement a trait

 
{quote}trait CommutativeExpresionCanonicalization {
this: Expression =>
override lazy val canonicalized: Expression = preCanonicalized
override lazy val preCanonicalized: Expression = {
val canonicalizedChildren = children.map(_.preCanonicalized)
Canonicalize.reorderCommutativeOperators {
withNewChildren(canonicalizedChildren)
}
}
}{quote}


was (Author: ashahid7):
I have added locally test for 

&&, ||, &, | and *  and they all fail for the same reason..

in process of adding tests for xor, greatest & least and testing the fix for 
the same...

 

the fix is to make these expressions implement a trait

 
{quote}trait CommutativeExpresionCanonicalization {
  this: Expression =>
  override lazy val canonicalized: Expression = preCanonicalized
  override lazy val preCanonicalized: Expression = {
    val canonicalizedChildren = children.map(_.preCanonicalized)
    Canonicalize.reorderCommutativeOperators {
     withNewChildren(canonicalizedChildren)
    }
  }
}{quote}

> Bug in Canonicalization of expressions like Add & Multiply i.e Commutative 
> Operators
> ------------------------------------------------------------------------------------
>
>                 Key: SPARK-40362
>                 URL: https://issues.apache.org/jira/browse/SPARK-40362
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.3.0
>            Reporter: Asif
>            Priority: Major
>              Labels: spark-sql
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> In the canonicalization code which is now in two stages, canonicalization 
> involving Commutative operators is broken, if they are subexpressions of 
> certain type of expressions which override precanonicalize.
> Consider following expression:
> a + b > 10
> This GreaterThan expression when canonicalized as a whole for first time,  
> will skip the call to Canonicalize.reorderCommutativeOperators for the Add 
> expression as the GreaterThan's canonicalization used precanonicalize on 
> children ( the Add expression).
>  
> so if create a new expression 
> b + a > 10 and invoke canonicalize it, the canonicalized versions of these 
> two expressions will not match.
> The test 
> {quote}
> test("bug X") {
>     val tr1 = LocalRelation('c.int, 'b.string, 'a.int)
>    val y = tr1.where('a.attr + 'c.attr > 10).analyze
>    val fullCond = y.asInstanceOf[Filter].condition.clone()
>   val addExpr = (fullCond match {
>     case GreaterThan(x: Add, _) => x
>     case LessThan(_, x: Add) => x
>   }).clone().asInstanceOf[Add]
> val canonicalizedFullCond = fullCond.canonicalized
> // swap the operands of add
> val newAddExpr = Add(addExpr.right, addExpr.left)
> // build a new condition which is same as the previous one, but with operands 
> of //Add reversed 
> val builtCondnCanonicalized = GreaterThan(newAddExpr, 
> Literal(10)).canonicalized
> assertEquals(canonicalizedFullCond, builtCondnCanonicalized)
> }
> {quote}
> This test fails.
> The fix which I propose is that for the commutative expressions,  the 
> precanonicalize should be overridden and   
> Canonicalize.reorderCommutativeOperators be invoked on the expression instead 
> of at place of canonicalize.  effectively for commutative operands ( add, or 
> , multiply , and etc)  canonicalize and precanonicalize should be same.
> I will file a PR for it



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

Reply via email to