peter-toth commented on code in PR #10832:
URL: https://github.com/apache/datafusion/pull/10832#discussion_r1632225209
##########
datafusion/optimizer/src/common_subexpr_eliminate.rs:
##########
@@ -782,9 +786,11 @@ impl TreeNodeRewriter for CommonSubexprRewriter<'_> {
return Ok(Transformed::new(expr, false, TreeNodeRecursion::Jump));
}
- let (up_index, expr_id) = &self.id_array[self.down_index];
+ let (up_index, expr_id, aliased) = &self.id_array[self.down_index];
self.down_index += 1;
+ self.aliased |= aliased;
Review Comment:
So I was refering to expression trees like: `(((a + b) - c) as "alias1") *
(a + b)`
I think currently in this case `a + b` gets extracted as a common expression
and the original expression becomes:
`(((common_1 as "a + b") - c) as "alias1") * (common_1 as "a + b")`
(`common_1` is not the actual name of the exracted expression but let's use
if for the sake of simplicity.)
Now, instead of the above this PR could drop the first added alias (`as "a
+ b"`) becuase the first `a + b` has an alias ancestor node (`as "alias1"`),
but it shouldn't drop the 2nd alias because the 2nd `a + b` doesn't have an
alias ancestor.
Using a flag (`self.aliased`) caluclated in `f_down()`, to decide if we can
drop an alias won't work. This is because `f_down()` gets called on all nodes
in the subtree of the first child (`((a + b) - c) as "alias1"`) of `*` before
it gets called on the 2nd child (`a + b`). So when `f_down()` encounters the
2nd `a + b`, `self.aliased` is true and you miss adding a necessary alias.
What I think this PR should do is maintain a stack of booleans and when
`f_down()` is called push a new boolean item, and when `f_up()` is called pop
one. A new boolean item should deppend on the previous item and if the current
`expr` is `Alias`.
(An alternative to maintaing a stack could be to use the `..._with_payload`
APIs to propagate down some kind of "has alias ancestor" state, but I haven't
finished adding that API yet: https://github.com/apache/datafusion/pull/8664.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]