I recommend doing it the way GPDB does: normalize the logical plan expressions in the preprocessing phase:
- variable always on the left, constant always on the right, for applicable binary operators;
- for join conditions, the left operand always comes from the left relation and the right operand always comes from the right relation, for reversible binary operators.
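To illustrate the two rules, here is a minimal sketch in plain Java. It is not GPDB or Calcite code: `PredicateNormalizer`, `Operand`, and the `leftFieldCount` parameter are hypothetical names made up for this example, and the set of operators treated as reversible is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy normalizer for binary predicates; all names here are hypothetical. */
final class PredicateNormalizer {

    /** Operand of a binary predicate: an input reference ($n) or a constant. */
    static final class Operand {
        final boolean isConstant;
        final int index;    // input field index, -1 for constants
        final String text;  // printable form, e.g. "$0" or "5"

        private Operand(boolean isConstant, int index, String text) {
            this.isConstant = isConstant;
            this.index = index;
            this.text = text;
        }

        static Operand ref(int index) {
            return new Operand(false, index, "$" + index);
        }

        static Operand literal(Object value) {
            return new Operand(true, -1, String.valueOf(value));
        }
    }

    /** Operator to use after swapping the operands (assumed reversible set). */
    private static final Map<String, String> MIRROR = new HashMap<>();
    static {
        MIRROR.put("=", "=");
        MIRROR.put("<>", "<>");
        MIRROR.put("<", ">");
        MIRROR.put("<=", ">=");
        MIRROR.put(">", "<");
        MIRROR.put(">=", "<=");
    }

    /**
     * Returns the predicate in canonical form.
     * leftFieldCount is the number of fields of the left join input; for a
     * plain filter, pass the field count of the single input so rule 2 never fires.
     */
    static String normalize(String op, Operand left, Operand right, int leftFieldCount) {
        String mirrored = MIRROR.get(op);
        boolean swap = false;
        if (mirrored != null) {
            if (left.isConstant && !right.isConstant) {
                // Rule 1: variable on the left, constant on the right.
                swap = true;
            } else if (!left.isConstant && !right.isConstant) {
                // Rule 2: left operand from the left relation, right from the right.
                swap = left.index >= leftFieldCount && right.index < leftFieldCount;
            }
        }
        return swap
            ? right.text + " " + mirrored + " " + left.text
            : left.text + " " + op + " " + right.text;
    }
}
```

With this sketch, `5 = $0` and `5 < $0` become `$0 = 5` and `$0 > 5`, and a join condition `$1 = $0` over two single-column inputs becomes `$0 = $1`. Operators without a mirror entry (for example `-`) are left untouched.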
- Haisheng

------------------------------------------------------------------
From: Enrico Olivelli <eolive...@gmail.com>
Date: December 30, 2019 03:28:38
To: <dev@calcite.apache.org>
Subject: Re: [DISCUSS] CALCITE-2450 reorder predicates to a canonical form

On Sun, Dec 29, 2019, 20:09 Vladimir Sitnikov <sitnikov.vladi...@gmail.com> wrote:

> Hi,
>
> We have a 1-year old issue with an idea to sort RexNode operands so they
> are consistent.
>
> For instance, "x=5" and "5=x" have the same semantics, so it would make
> sense to stick to a single implementation.
> A discussion can be found in
> https://issues.apache.org/jira/browse/CALCITE-2450
>
> We do not normalize RexNodes, thus it results in excessive planning time,
> especially when the planner is trying to reorder joins.
> For instance, it thinks Join(A, B, $0=$1) and Join(A, B, $1=$0) are
> different joins, however, they are equivalent.
>
> The normalization does not seem to cost much, however, it enables me to
> activate more rules (e.g. EnumerabeMergeRule),
> so it is good as it enables to consider more sophisticated plans.
>
> I see two approaches:
> a) Normalize in RexNode constructor. This seems easy to implement, however,
> there's a catch
> if someone assumed that the order of operands would be the same as the one
> that was passed to the constructor.
> I don't think there are such assumptions in the wild, but there might be.
> The javadoc for the relevant methods says nothing regarding the operand
> order.
> However, the good thing would be RexNode would feel the same in the
> debugger and in its toString representation.
>
> b) Normalize at RexCall#computeDigest only.
> In other words, keep the operands unsorted, but make sure the digest is
> created as if the operands were sorted.
> This seems to be the most transparent change, however, it might surprise
> that `toString` does not match to whatever is seen in the debugger.
>
> In any case, making `RexCall#toString` print sorted representation would
> alter lots of tests.
> For :core it is like 5540 tests completed, 358 failed, 91 skipped :((
>
> WDYT?
>

I really would love this feature.

Just my 2 cents
Enrico

> Hopefully, making the RexNode representation sorted would reduce the number
> of `$1=$0` vs `$0=$1` plan diffs.
>
> Vladimir
>
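
For reference, approach (b) from the quoted message (keep the operands in their original order, but build the digest as if they were sorted) could look roughly like the sketch below. This is plain Java, not Calcite code: `ToyCall`, its `digest()` method, and the set of operators treated as symmetric are assumptions made for illustration, and operands are sorted as plain strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy stand-in for an operator call; hypothetical, not Calcite's RexCall. */
final class ToyCall {
    /** Operators whose operands may be reordered freely (assumed set). */
    private static final Set<String> SYMMETRIC =
        new HashSet<>(Arrays.asList("=", "<>", "AND", "OR", "+", "*"));

    final String op;
    final List<String> operands;  // kept in construction order

    ToyCall(String op, String... operands) {
        this.op = op;
        this.operands = Collections.unmodifiableList(Arrays.asList(operands.clone()));
    }

    /** Digest used for equality: operands of symmetric operators are sorted. */
    String digest() {
        List<String> sorted = new ArrayList<>(operands);
        if (SYMMETRIC.contains(op)) {
            Collections.sort(sorted);
        }
        return op + "(" + String.join(", ", sorted) + ")";
    }

    @Override public boolean equals(Object o) {
        return o instanceof ToyCall && digest().equals(((ToyCall) o).digest());
    }

    @Override public int hashCode() {
        return digest().hashCode();
    }
}
```

Here `new ToyCall("=", "$1", "$0")` and `new ToyCall("=", "$0", "$1")` compare equal and share a digest, while the operands of the first call are still stored as `$1`, `$0`. That is exactly the debugger-versus-digest mismatch the quoted message warns about.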