With Zoltan’s help, I just committed [1] a nice change to RexSimplify, adding a 
“paranoid” flag. If you set paranoid = true, then RexSimplify will simplify 
expressions but it will also try to prove that the transformation is wrong. It 
does that by assigning sample values to variables and checking that the 
simplified expression produces the same result as the original expression.

Consider the expression

  (not a) and (b > 5) and (not true)

This simplifies to “false”, because of the trailing “not true”. “a” is boolean, 
so we use sample values {unknown, false, true}; “b” is an integer, so we use 
values {null, 0, 1, 1,000,000}. For the 15 (3 x 5) combinations of assignments 
to variables (a, b), we evaluate the original expression and the simplified 
expression, and make sure they give the same result. If the result is 
different, we have proved that the simplification is invalid. I already found 
one bug [2] this way.

I added RexInterpreter (which evaluates RexNode expressions by interpreting, 
rather than by generating Java as RexExecutor does) because I didn’t want to 
increase coupling.

One area this could be improved is to add more values based on the expression. 
Because the integer literal 5 is present, we ought to check values {4, 5, 6} in 
addition to the default values {null, -1, 0, 1, 1,000,000}.

It’s really important that our transformation rules can be trusted, so please 
use this functionality as much as you can in tests. If you your transformation 
does not use RexSimplify, you can use RexAnalyzer directly. Also, add more 
tests for rex transformations; each test gives RexSimplifier a lot of 
opportunities to find faults.

Julian

[1] https://issues.apache.org/jira/browse/CALCITE-2314 
<https://issues.apache.org/jira/browse/CALCITE-2314>

[2] https://issues.apache.org/jira/browse/CALCITE-2332 
<https://issues.apache.org/jira/browse/CALCITE-2332> 

Reply via email to