peter-toth commented on code in PR #10832: URL: https://github.com/apache/datafusion/pull/10832#discussion_r1632225209
########## datafusion/optimizer/src/common_subexpr_eliminate.rs: ########## @@ -782,9 +786,11 @@ impl TreeNodeRewriter for CommonSubexprRewriter<'_> { return Ok(Transformed::new(expr, false, TreeNodeRecursion::Jump)); } - let (up_index, expr_id) = &self.id_array[self.down_index]; + let (up_index, expr_id, aliased) = &self.id_array[self.down_index]; self.down_index += 1; + self.aliased |= aliased; Review Comment: So I was refering to expression trees like: `(((a + b) - c) as "alias1") * (a + b)` I think currently in this case `a + b` gets extracted as a common expression and the original expression becomes: `(((common_1 as "a + b") - c) as "alias1") * (common_1 as "a + b")` (`common_1` is not the actual name of the exracted expression but let's use if for the sake of simplicity.) Now, instead of the above this PR could drop the first added alias (`as "a + b"`) becuase the first `a + b` has an alias ancestor node (`as "alias1"`), but it shouldn't drop the 2nd alias because the 2nd `a + b` doesn't have an alias ancestor. Using a flag (`self.aliased`) caluclated in `f_down()`, to decide if we can drop an alias won't work. This is because `f_down()` gets called on all nodes in the subtree of the first child (`((a + b) - c) as "alias1"`) of `*` before it gets called on the 2nd child (`a + b`). So when `f_down()` encounters the 2nd `a + b`, `self.aliased` is true and you miss adding a necessary alias. What I think this PR should do is maintain a stack of booleans and when `f_down()` is called push a new boolean item, and when `f_up()` is called pop one. A new boolean item should deppend on the previous item and if the current `expr` is `Alias`. (An alternative to maintaing a stack could be to use the `..._with_payload` APIs to propagate down some kind of "has alias ancestor" state, but I haven't finished adding that API yet: https://github.com/apache/datafusion/pull/8664. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org