[ 
https://issues.apache.org/jira/browse/CALCITE-7635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L reassigned CALCITE-7635:
----------------------------------

    Assignee: Ruben Q L

> Simplification result of conjunction of comparisons depends on terms order
> --------------------------------------------------------------------------
>
>                 Key: CALCITE-7635
>                 URL: https://issues.apache.org/jira/browse/CALCITE-7635
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.42.0
>            Reporter: Ruben Q L
>            Assignee: Ruben Q L
>            Priority: Major
>
> Calling RexSimplify with unknown=FALSE on this expression:
> {noformat}
> x < 10 AND x > 0 AND x < 20
> i.e.
> AND( <(x, 10), >(x, 0), <(x, 20) )
> {noformat}
> Should return {{x < 10 AND x > 0}}
> And this happens if we run the following test inside RexProgramTest 
> (simplification result is returned on SEARCH format):
> {code:java}
>   @Test void testSimplifyAndComparison() {
>     checkSimplifyFilter(
>         and(
>             gt(vInt(), literal(0)),
>             lt(vInt(), literal(10)),
>             lt(vInt(), literal(20))),
>         "SEARCH(?0.int0, Sarg[(0..10)])"); // runs OK
>   }
> {code}
> However, if we alter the order of the terms, and run instead:
> {code:java}
>   @Test void testSimplifyAndComparison() {
>     checkSimplifyFilter(
>         and(
>             lt(vInt(), literal(10)),
>             gt(vInt(), literal(0)),
>             lt(vInt(), literal(20))),
>         "SEARCH(?0.int0, Sarg[(0..10)])"); // fails
>   }
> {code}
> If fails with:
> {noformat}
> Expected: with toString() is "SEARCH(?0.int0, Sarg[(0..10)])"
>      but: toString() was "SEARCH(?0.int0, Sarg[(0..10); NULL AS FALSE])"
> {noformat}
> The problem is that, in the second case, the simplification result (before 
> being converted into a SEARCH) is not {{x < 10 AND x > 0}} , but {{x < 10 AND 
> x > 0 AND X IS NOT NULL}}; this extra IS NOT NULL turns the final SEARCH 
> expression into "NULL AS FALSE" (instead of the default "NULL AS UNKNOWN" 
> that we got on the first case). This "NULL AS FALSE" will make that, if this 
> SEARCH expression gets later expanded, we'll get again {{x < 10 AND x > 0 AND 
> X IS NOT NULL}}; instead of the expected {{x < 10 AND x > 0}};
> The problem originates on {{RexSimplify#simplifyAnd}}, with unknownAs == 
> FALSE and predicateElimination (be default true), we get into 
> {{simplifyAndTerms}}:
> {code:java}
> RexNode simplifyAnd(RexCall e, RexUnknownAs unknownAs) {
>     List<RexNode> operands = RelOptUtil.conjunctions(e);
>     if (unknownAs == FALSE && predicateElimination) {
>       simplifyAndTerms(operands, FALSE);
>     }
>     ...
> {code}
> Inside this method, each term is simplified using the previous terms as 
> "predicates" into the RexSimplify:
> {code:java}
> private void simplifyAndTerms(List<RexNode> terms, RexUnknownAs unknownAs) {
>     RexSimplify simplify = this;
>     for (int i = 0; i < terms.size(); i++) {
>       RexNode t = terms.get(i);
>       if (Predicate.of(t) == null) {
>         continue;
>       }
>       terms.set(i, simplify.simplify(t, unknownAs)); // simplifies using 
> previous terms as predicates
>       RelOptPredicateList newPredicates =
>           simplify.predicates.union(rexBuilder,
>               RelOptPredicateList.of(rexBuilder, terms.subList(i, i + 1)));
>       simplify = simplify.withPredicates(newPredicates);
>     }
>     ...
> {code}
> When simplifying each term (using the previous ones as predicates), since 
> they are comparisons, we'll get inside RexSimplify#simplifyComparison, which 
> calls at the end {{simplifyUsingPredicates}}:
> {code:java}
> private <C extends Comparable<C>> RexNode simplifyComparison(RexCall e,
>       RexUnknownAs unknownAs, Class<C> clazz) {
>     ...
>     return simplifyUsingPredicates(e2, clazz);
>   }
> {code}
> And {{simplifyUsingPredicates}} calls {{residue}} method, which can return 
> {{RangeSets.rangeSetAll()}} (which results in returning IS NOT NULL, because 
> "nullability might be problematic"), but only depending on the order in which 
> the terms have been processed:
> {code:java}
>   private <C extends Comparable<C>> RexNode simplifyUsingPredicates(RexNode e,
>       Class<C> clazz) {
>     ...
>     final C v0 = comparison.literal.getValueAs(clazz);
>     ...
>     final RangeSet<C> rangeSet = rangeSet(comparison.kind, v0);
>     final RangeSet<C> rangeSet2 =
>         residue(comparison.ref, rangeSet, predicates.pulledUpPredicates,
>             clazz);
>     if (rangeSet2.isEmpty()) {
>       // Term is impossible to satisfy given these predicates
>       return rexBuilder.makeLiteral(false);
>     } else if (rangeSet2.equals(rangeSet)) {
>       // no change
>       return e;
>     } else if (rangeSet2.equals(RangeSets.rangeSetAll())) {
>       // Range is always satisfied given these predicates; but nullability 
> might
>       // be problematic
>       return simplify(
>           rexBuilder.makeCall(RexUtil.getPos(e), 
> SqlStdOperatorTable.IS_NOT_NULL, comparison.ref),
>           RexUnknownAs.UNKNOWN); // [1]
>     ...
>     } else {
>       // range has been reduced but it's not worth simplifying
>       return e; // [2]
>     }
> {code}
> For example: 
> - Processing x<10, x>0, x<20; will return in X is NOT NULL (via [1]) when 
> x<20 is processed here
> - Processing x<10, x<20, x>0; will return in X is NOT NULL (via [1]) when 
> x<20 is processed here
> - Processing x>0, x<10, x<20; will return x<20 (via [2]) when x<20 is 
> processed here 
> - Processing x>0, x<20, x<10; will return x<20 (via [2]) when x<20 is 
> processed here (and x<10 when x<10 is processed) 
> - Processing x<20, x>0, x<10; will return x<20 (via [2]) when x<20 is 
> processed here (and x<10 when x<10 is processed)
> Basically, the problem is: when the expression x<20 (i.e. range (-INF, 20)) 
> with the list of predicates [x<10, x>0] i.e. [(-INF, 10) , (0, INF)], is 
> processed and, as in this example, the list of previous predicates contains 
> as first item a predicate enclosed by it [3], in this case x<10 (i.e. 
> range(-INF, 10)), {{residue}} will consider intermediate result as 
> {{RangeSets.rangeSetAll()}} i.e. (-INF, +INF), which will enclose any other 
> predicate on subsequent iterations, so this will be the result to be returned:
> {code:java}
> private static <C extends Comparable<C>> RangeSet<C> residue(RexNode ref,
>       RangeSet<C> r0, List<RexNode> predicates, Class<C> clazz) {
>     RangeSet<C> result = r0;
>     for (RexNode predicate : predicates) {
>       ...
>         final RexCall call = (RexCall) predicate;
>         final Comparison comparison = Comparison.of(call);
>         if (comparison != null && comparison.ref.equals(ref)) {
>           final C c1 = comparison.literal.getValueAs(clazz);
>           ...
>             final Range<C> r1 = range(comparison.kind, c1);
>             if (result.encloses(r1)) {
>               // Given these predicates, term is always satisfied.
>               // e.g. r0 is "$0 < 10", r1 is "$0 < 5"
>               result = RangeSets.rangeSetAll(); [3]
>               continue;
>             }
>             result = result.subRangeSet(r1); [4]
>       ...
>     }
>     return result;
>   }
> {code}
> Notice that, in case we process x<20 (i.e. range (-INF, 20)), but the list of 
> predicates is not [(-INF, 10), (0, INF)] , but instead the other way around 
> [(0, INF), (-INF, 10)]; then result will not be (-INF, +INF), but (0,10), 
> since it will be accumulated as (0, 20) via [4] on the first iteration (when 
> processing (-INF, 20) with first predicate (0, INF)); and then (0, 10) on the 
> second iteration, again via [4] when processing this intermediate result (0, 
> 20) with the second predicate (-INF, 10).
> To sum up, the fact that RexSimplify#residue result can differ depending on 
> the order of its list of predicates parameter, leads to a global different 
> result on the initial RexSimplify#simplify call.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to