Jefffrey commented on issue #8379: URL: https://github.com/apache/arrow-datafusion/issues/8379#issuecomment-1836817802
Actually, I think I was off the mark on what `ExprId` is intended to do. It seems it would be more useful if there were a new logical expr enum variant, such as `AttributeReference`, which would refer to an expr from the parent plan by `ExprId`.

For example, given a logical plan:

```
Projection: a.int_col, b.double_col, CAST(a.date_string_col AS Utf8)
  Inner Join: a.int_col = b.int_col
    SubqueryAlias: a
      Projection: alltypes_plain.int_col, alltypes_plain.date_string_col
        Filter: alltypes_plain.id > Int32(1)
          TableScan: alltypes_plain projection=[id, int_col, date_string_col], partial_filters=[alltypes_plain.id > Int32(1)]
    SubqueryAlias: b
      Projection: alltypes_plain.int_col, alltypes_plain.double_col
        Filter: CAST(alltypes_plain.tinyint_col AS Float64) < alltypes_plain.double_col
          TableScan: alltypes_plain projection=[tinyint_col, int_col, double_col], partial_filters=[CAST(alltypes_plain.tinyint_col AS Float64) < alltypes_plain.double_col]
```

That top-level projection has `a.int_col` as a `Column`, for example, which when turned into a physical plan needs to search the parent schema by name:

https://github.com/apache/arrow-datafusion/blob/a6e6d3fab083839239ef81cf3a3546dd8929a541/datafusion/core/src/physical_planner.rs#L879-L891

Whereas with `ExprId`s, `a.int_col` could instead be an `AttributeReference` that identifies, by id, which expr in the parent expr list it refers to. And I think each new expr would get a new id.
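To make the idea a bit more concrete, here is a rough Rust sketch of what I have in mind. None of these types exist in DataFusion: `ExprId`, `ExprIdGenerator`, `OutputExpr`, `resolve` and `index_of` are all hypothetical names, and `Expr` here is a cut-down stand-in for the real enum. It only illustrates the shape of the approach and is not a proposed implementation.

```rust
/// Hypothetical identifier; every newly created expression gets a fresh one.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ExprId(u64);

/// Hypothetical helper that hands out ids in increasing order.
#[derive(Debug, Default)]
struct ExprIdGenerator {
    next: u64,
}

impl ExprIdGenerator {
    fn fresh(&mut self) -> ExprId {
        let id = ExprId(self.next);
        self.next += 1;
        id
    }
}

/// Simplified stand-in for a logical expression (not DataFusion's `Expr`).
#[derive(Debug, Clone, PartialEq, Eq)]
enum Expr {
    /// Name-based reference, resolved by searching the input by name.
    Column { relation: String, name: String },
    /// Id-based reference to an expression produced by the input plan.
    AttributeReference(ExprId),
}

/// One entry in the input plan's output expression list.
#[derive(Debug, Clone)]
struct OutputExpr {
    id: ExprId,
    relation: String,
    name: String,
}

/// Build an output list, assigning a fresh id to each (relation, name) pair.
fn output_list(ids: &mut ExprIdGenerator, cols: &[(&str, &str)]) -> Vec<OutputExpr> {
    cols.iter()
        .map(|(relation, name)| OutputExpr {
            id: ids.fresh(),
            relation: relation.to_string(),
            name: name.to_string(),
        })
        .collect()
}

/// Rewrite a name-based `Column` into an id-based reference.
/// Returns `None` if no input expression matches the qualified name.
fn resolve(expr: &Expr, input: &[OutputExpr]) -> Option<Expr> {
    match expr {
        Expr::Column { relation, name } => input
            .iter()
            .find(|e| &e.relation == relation && &e.name == name)
            .map(|e| Expr::AttributeReference(e.id)),
        Expr::AttributeReference(_) => Some(expr.clone()),
    }
}

/// Find the input position of an already-resolved reference, comparing ids only.
fn index_of(expr: &Expr, input: &[OutputExpr]) -> Option<usize> {
    match expr {
        Expr::AttributeReference(id) => input.iter().position(|e| e.id == *id),
        Expr::Column { .. } => None,
    }
}
```

With the join output from the plan above modelled as `[a.int_col, a.date_string_col, b.int_col, b.double_col]`, resolving `a.int_col` and `b.int_col` would give two different ids, and every later lookup compares a single integer rather than qualified names.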
Honestly, I could be way off the mark here on the usages/benefits of `ExprId` 😅

It's just something I was thinking about, especially in relation to how verbose it can be to check whether two columns are the same once you take into account the table, schema and catalog parts of a column's identifier. See the troubles with the ambiguity check here: https://github.com/apache/arrow-datafusion/issues/6012

So instead of having to find the original column of a projected column by name during logical optimization and physical planning, that lookup could be done once in an analyzer rule pass, and everything afterwards could use `ExprId`s.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org