alamb opened a new issue #98: URL: https://github.com/apache/arrow-datafusion/issues/98
*Note*: migrated from original JIRA: https://issues.apache.org/jira/browse/ARROW-9770

The high level idea is that if an expression can be partially evaluated during planning time, then:
1. The execution time will be decreased
2. There may be additional optimizations possible (like removing entire LogicalPlan nodes, for example)

I recently saw the following selection expression created (by the [predicate push down|https://github.com/apache/arrow/pull/7880]):

{code}
Selection: #a Eq Int64(1) And #b GtEq Int64(1) And #a LtEq Int64(1) And #a Eq Int64(1) And #b GtEq Int64(1) And #a LtEq Int64(1)
  TableScan: test projection=None
{code}

This could be simplified significantly:
1. Duplicate clauses could be removed (e.g. `#a Eq Int64(1) And #a Eq Int64(1)` --> `#a Eq Int64(1)`)
2. Algebraic simplification (e.g. `A<=5 and A=5` is the same as `A=5`)

Inspiration can be taken from the postgres code that evaluates constant expressions: https://doxygen.postgresql.org/clauses_8c.html#ac91c4055a7eb3aa6f1bc104479464b28 (in this case, for example, if you have a predicate A=5 then you can basically substitute A=5 into any expression higher up in the plan).

Other classic optimizations include things such as `A OR TRUE` --> `TRUE`, `A AND TRUE` --> `A`, etc.
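
To make the proposed rewrites concrete, here is a minimal sketch of what such a simplification pass could look like. It is only an illustration under stated assumptions: the `Expr` enum, the helper constructors and the `simplify` function are hypothetical stand-ins invented for this example, not DataFusion's actual types or API. It covers duplicate clause removal, dropping a bound that is implied by an equality on the same column, and the boolean constant rules. It deliberately leaves contradictory clauses (e.g. `A=5 and A<=3`) untouched, since folding those to FALSE would need care around NULL handling.

```rust
// Hypothetical, simplified expression type (NOT DataFusion's real `Expr`).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Int64(i64),
    Boolean(bool),
    Eq(Box<Expr>, Box<Expr>),
    LtEq(Box<Expr>, Box<Expr>),
    GtEq(Box<Expr>, Box<Expr>),
    And(Box<Expr>, Box<Expr>),
    Or(Box<Expr>, Box<Expr>),
}

// Hypothetical helper constructors to keep expressions readable.
fn col(name: &str) -> Expr { Expr::Column(name.to_string()) }
fn lit(v: i64) -> Expr { Expr::Int64(v) }
fn eq(l: Expr, r: Expr) -> Expr { Expr::Eq(Box::new(l), Box::new(r)) }
fn lt_eq(l: Expr, r: Expr) -> Expr { Expr::LtEq(Box::new(l), Box::new(r)) }
fn gt_eq(l: Expr, r: Expr) -> Expr { Expr::GtEq(Box::new(l), Box::new(r)) }
fn and(l: Expr, r: Expr) -> Expr { Expr::And(Box::new(l), Box::new(r)) }
fn or(l: Expr, r: Expr) -> Expr { Expr::Or(Box::new(l), Box::new(r)) }

/// Flatten nested `And` nodes into a flat list of conjuncts.
fn conjuncts(expr: Expr, out: &mut Vec<Expr>) {
    match expr {
        Expr::And(l, r) => {
            conjuncts(*l, out);
            conjuncts(*r, out);
        }
        other => out.push(other),
    }
}

/// True if `bound` (`col <= d` or `col >= d`) is implied by `eq` (`col = c`).
fn implied_by(bound: &Expr, eq: &Expr) -> bool {
    let (col, c) = match eq {
        Expr::Eq(l, r) => match (l.as_ref(), r.as_ref()) {
            (Expr::Column(name), Expr::Int64(v)) => (name, *v),
            _ => return false,
        },
        _ => return false,
    };
    match bound {
        Expr::LtEq(l, r) => match (l.as_ref(), r.as_ref()) {
            (Expr::Column(n), Expr::Int64(d)) => n == col && c <= *d,
            _ => false,
        },
        Expr::GtEq(l, r) => match (l.as_ref(), r.as_ref()) {
            (Expr::Column(n), Expr::Int64(d)) => n == col && c >= *d,
            _ => false,
        },
        _ => false,
    }
}

/// Simplify a flat list of conjuncts and rebuild a left-deep `And` chain.
fn simplify_conjunction(parts: Vec<Expr>) -> Expr {
    // `A AND FALSE` is FALSE.
    if parts.contains(&Expr::Boolean(false)) {
        return Expr::Boolean(false);
    }
    let mut kept: Vec<Expr> = Vec::new();
    for part in &parts {
        // Drop TRUE, exact duplicates, and bounds implied by an equality.
        let redundant = *part == Expr::Boolean(true)
            || kept.contains(part)
            || parts.iter().any(|other| implied_by(part, other));
        if !redundant {
            kept.push(part.clone());
        }
    }
    // An empty conjunction is TRUE.
    kept.into_iter()
        .reduce(|acc, e| and(acc, e))
        .unwrap_or(Expr::Boolean(true))
}

/// Recursively simplify `And` / `Or` nodes; other nodes are returned as-is.
fn simplify(expr: Expr) -> Expr {
    match expr {
        Expr::And(l, r) => {
            let mut parts = Vec::new();
            conjuncts(simplify(*l), &mut parts);
            conjuncts(simplify(*r), &mut parts);
            simplify_conjunction(parts)
        }
        Expr::Or(l, r) => match (simplify(*l), simplify(*r)) {
            // `A OR TRUE` is TRUE.
            (Expr::Boolean(true), _) | (_, Expr::Boolean(true)) => Expr::Boolean(true),
            // `A OR FALSE` is A.
            (Expr::Boolean(false), other) | (other, Expr::Boolean(false)) => other,
            (a, b) if a == b => a,
            (a, b) => or(a, b),
        },
        other => other,
    }
}
```

Applied to the selection expression above, `simplify` in this sketch returns the equivalent of `#a Eq Int64(1) And #b GtEq Int64(1)`. A real implementation would more likely be an optimizer rule that walks the whole LogicalPlan rather than a standalone function on one expression.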