[ https://issues.apache.org/jira/browse/SPARK-40362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17601067#comment-17601067 ]
Asif commented on SPARK-40362: ------------------------------ will be generating a PR tomorrow.. > Bug in Canonicalization of expressions like Add & Multiply i.e Commutative > Operators > ------------------------------------------------------------------------------------ > > Key: SPARK-40362 > URL: https://issues.apache.org/jira/browse/SPARK-40362 > Project: Spark > Issue Type: Bug > Components: SQL > Affects Versions: 3.3.0 > Reporter: Asif > Priority: Major > Labels: spark-sql > Original Estimate: 72h > Remaining Estimate: 72h > > In the canonicalization code which is now in two stages, canonicalization > involving Commutative operators is broken, if they are subexpressions of > certain type of expressions which override precanonicalize. > Consider following expression: > a + b > 10 > This GreaterThan expression when canonicalized as a whole for first time, > will skip the call to Canonicalize.reorderCommutativeOperators for the Add > expression as the GreaterThan's canonicalization used precanonicalize on > children ( the Add expression). > > so if create a new expression > b + a > 10 and invoke canonicalize it, the canonicalized versions of these > two expressions will not match. > The test > {quote} > test("bug X") { > val tr1 = LocalRelation('c.int, 'b.string, 'a.int) > val y = tr1.where('a.attr + 'c.attr > 10).analyze > val fullCond = y.asInstanceOf[Filter].condition.clone() > val addExpr = (fullCond match { > case GreaterThan(x: Add, _) => x > case LessThan(_, x: Add) => x > }).clone().asInstanceOf[Add] > val canonicalizedFullCond = fullCond.canonicalized > // swap the operands of add > val newAddExpr = Add(addExpr.right, addExpr.left) > // build a new condition which is same as the previous one, but with operands > of //Add reversed > val builtCondnCanonicalized = GreaterThan(newAddExpr, > Literal(10)).canonicalized > assertEquals(canonicalizedFullCond, builtCondnCanonicalized) > } > {quote} > This test fails. > The fix which I propose is that for the commutative expressions, the > precanonicalize should be overridden and > Canonicalize.reorderCommutativeOperators be invoked on the expression instead > of at place of canonicalize. effectively for commutative operands ( add, or > , multiply , and etc) canonicalize and precanonicalize should be same. > I will file a PR for it -- This message was sent by Atlassian Jira (v8.20.10#820010) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org