The methods that may help are RexUtil.toCnf and RexUtil.pullFactors. The latter is less strict and therefore does not cause a combinatorial explosion in the size of the tree, so it is more useful in practice. They are tested in RexProgramTest.testCnf and testPullFactors [1].
These methods canonicalize at the term level (i.e. "a or b or a" becomes "a or b") but do not attempt to recognize terms that are equivalent. Neither of them exploits the symmetry of "=" (i.e. a = b iff b = a).

It would make sense to normalize comparisons between field references and literals so that the lower field reference is always on the left. So, "$6 = $3" becomes "$3 = $6"; "$6 > $3" becomes "$3 < $6". And "literal <= $5" becomes "$5 >= literal". This would change a few test logs but would not damage performance, and would improve a few plans.

Can you please log a JIRA case with what you think would be useful, even if you don't intend to write a patch?

Julian

[1] https://github.com/apache/incubator-calcite/blob/master/core/src/test/java/org/apache/calcite/test/RexProgramTest.java#L562

On May 18, 2015, at 11:02 AM, Jesus Camachorodriguez < [email protected]> wrote:

Julian,

This is the JIRA case for pushing the expressions from filters down:
https://issues.apache.org/jira/browse/HIVE-9069

<snip>

Let me know if you think a Calcite method could be applied to simplify that predicate, or I could extend a method myself to cover those cases.

Thanks,
Jesús
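
For illustration only, the comparison normalization proposed in the reply above could be sketched roughly as follows. This is a standalone toy model, not Calcite code: the ComparisonNormalizer, Operand and Comparison classes and their methods are hypothetical names invented for this sketch, and an actual patch would presumably operate on RexNode trees instead.

```java
// Hypothetical, self-contained sketch; none of these classes exist in Calcite.
final class ComparisonNormalizer {

    /** Toy operand: either a field reference such as $3, or a literal. */
    static final class Operand {
        final boolean isField;
        final int index;     // field ordinal; -1 for literals
        final String text;   // printable form

        Operand(boolean isField, int index, String text) {
            this.isField = isField;
            this.index = index;
            this.text = text;
        }
    }

    /** Toy binary comparison: left op right. */
    static final class Comparison {
        final Operand left;
        final String op;
        final Operand right;

        Comparison(Operand left, String op, Operand right) {
            this.left = left;
            this.op = op;
            this.right = right;
        }

        @Override public String toString() {
            return left.text + " " + op + " " + right.text;
        }
    }

    static Operand field(int i) {
        return new Operand(true, i, "$" + i);
    }

    static Operand literal(String s) {
        return new Operand(false, -1, s);
    }

    static Comparison cmp(Operand l, String op, Operand r) {
        return new Comparison(l, op, r);
    }

    /** Returns the operator to use once the operands are swapped, or null if unknown. */
    static String reverse(String op) {
        switch (op) {
            case "=":  return "=";
            case "<>": return "<>";
            case "<":  return ">";
            case "<=": return ">=";
            case ">":  return "<";
            case ">=": return "<=";
            default:   return null;
        }
    }

    /**
     * Puts the lower field reference on the left, and a field reference
     * to the left of a literal, reversing the operator when operands swap.
     */
    static Comparison normalize(Comparison c) {
        boolean swap;
        if (c.left.isField && c.right.isField) {
            swap = c.left.index > c.right.index;
        } else {
            swap = !c.left.isField && c.right.isField;
        }
        String reversed = reverse(c.op);
        if (!swap || reversed == null) {
            return c; // already normalized, or an operator this sketch does not handle
        }
        return new Comparison(c.right, reversed, c.left);
    }
}
```

With this model, normalizing "$6 > $3" yields "$3 < $6", and "literal <= $5" yields "$5 >= literal", matching the examples in the message. Comparisons that are already in normal form are returned unchanged.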
