[ https://issues.apache.org/jira/browse/CALCITE-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Leonard Xu updated CALCITE-3368: -------------------------------- Description: I found that Calcite's simplification logic of isNull expression will convert from *"f(operand0, operand1) IS NULL"* to *"operand0 IS NULL OR operand1 IS NULL"* if the Policy of RexNode‘s SqlKind is ANY When I do some overflow tests 。 This simplification will not calculate the real vale of *f(operand0, operand1)* (eg..'f0 + 'f1 )which maybe overflow. {code:java} //org.apache.calcite.rex.RexSimplify.java private RexNode simplifyIsNull(RexNode a) { // Simplify the argument first, // call ourselves recursively to see whether we can make more progress. // For example, given // "(CASE WHEN FALSE THEN 1 ELSE 2) IS NULL" we first simplify the // argument to "2", and only then we can simplify "2 IS NULL" to "FALSE". a = simplify(a, UNKNOWN); if (!a.getType().isNullable() && isSafeExpression(a)) { return rexBuilder.makeLiteral(false); } if (RexUtil.isNull(a)) { return rexBuilder.makeLiteral(true); } if (a.getKind() == SqlKind.CAST) { return null; } switch (Strong.policy(a.getKind())) { case NOT_NULL: return rexBuilder.makeLiteral(false); case ANY: // "f" is a strong operator, so "f(operand0, operand1) IS NULL" simplifies // to "operand0 IS NULL OR operand1 IS NULL" final List<RexNode> operands = new ArrayList<>(); for (RexNode operand : ((RexCall) a).getOperands()) { final RexNode simplified = simplifyIsNull(operand); if (simplified == null) { operands.add( rexBuilder.makeCall(SqlStdOperatorTable.IS_NULL, operand)); } else { operands.add(simplified); } } return RexUtil.composeDisjunction(rexBuilder, operands, false); case AS_IS: default: return null; } }{code} And most of calculating SqlKinds are assigned *Policy.ANY* at present. {code:java} //org.apache.calcite.plan.Strong.java public static Policy policy(SqlKind kind) { return MAP.getOrDefault(kind, Policy.AS_IS); } .... map.put(SqlKind.PLUS, Policy.ANY); map.put(SqlKind.PLUS_PREFIX, Policy.ANY); map.put(SqlKind.MINUS, Policy.ANY); map.put(SqlKind.MINUS_PREFIX, Policy.ANY); map.put(SqlKind.TIMES, Policy.ANY); map.put(SqlKind.DIVIDE, Policy.ANY); * that operator evaluates to null. */ public enum Policy { /** This kind of expression is never null. No need to look at its arguments, * if it has any. */ NOT_NULL, /** This kind of expression has its own particular rules about whether it * is null. */ CUSTOM, /** This kind of expression is null if and only if at least one of its * arguments is null. */ ANY, /** This kind of expression may be null. There is no way to rewrite. */ AS_IS, }{code} It should be an obvious nonequivalent simplification. And this issue come from Flink (FLINK-14030). [~danny0405], Could you have a look at this? was: When I do some overflow test ,I found that Calcite's simplification logic of isNull expression will convert from *"f(operand0, operand1) IS NULL"* to *"operand0 IS NULL OR operand1 IS NULL"* if the Policy of RexNode‘s SqlKind is ANY 。 This simplification will not calculate the real vale of *f(operand0, operand1)* (eg..'f0 + 'f1 )which maybe overflow. {code:java} //org.apache.calcite.rex.RexSimplify.java private RexNode simplifyIsNull(RexNode a) { // Simplify the argument first, // call ourselves recursively to see whether we can make more progress. // For example, given // "(CASE WHEN FALSE THEN 1 ELSE 2) IS NULL" we first simplify the // argument to "2", and only then we can simplify "2 IS NULL" to "FALSE". a = simplify(a, UNKNOWN); if (!a.getType().isNullable() && isSafeExpression(a)) { return rexBuilder.makeLiteral(false); } if (RexUtil.isNull(a)) { return rexBuilder.makeLiteral(true); } if (a.getKind() == SqlKind.CAST) { return null; } switch (Strong.policy(a.getKind())) { case NOT_NULL: return rexBuilder.makeLiteral(false); case ANY: // "f" is a strong operator, so "f(operand0, operand1) IS NULL" simplifies // to "operand0 IS NULL OR operand1 IS NULL" final List<RexNode> operands = new ArrayList<>(); for (RexNode operand : ((RexCall) a).getOperands()) { final RexNode simplified = simplifyIsNull(operand); if (simplified == null) { operands.add( rexBuilder.makeCall(SqlStdOperatorTable.IS_NULL, operand)); } else { operands.add(simplified); } } return RexUtil.composeDisjunction(rexBuilder, operands, false); case AS_IS: default: return null; } }{code} And most of calculating SqlKinds are assigned *Policy.ANY* at present. {code:java} //org.apache.calcite.plan.Strong.java public static Policy policy(SqlKind kind) { return MAP.getOrDefault(kind, Policy.AS_IS); } .... map.put(SqlKind.PLUS, Policy.ANY); map.put(SqlKind.PLUS_PREFIX, Policy.ANY); map.put(SqlKind.MINUS, Policy.ANY); map.put(SqlKind.MINUS_PREFIX, Policy.ANY); map.put(SqlKind.TIMES, Policy.ANY); map.put(SqlKind.DIVIDE, Policy.ANY); * that operator evaluates to null. */ public enum Policy { /** This kind of expression is never null. No need to look at its arguments, * if it has any. */ NOT_NULL, /** This kind of expression has its own particular rules about whether it * is null. */ CUSTOM, /** This kind of expression is null if and only if at least one of its * arguments is null. */ ANY, /** This kind of expression may be null. There is no way to rewrite. */ AS_IS, }{code} It should be an obvious nonequivalent simplification. And this issue come from Flink ([FLINK-14030|https://issues.apache.org/jira/browse/FLINK-14030]). [~danny0405] could you have a look at this? > Nonequivalent simplification happens in RexSimplify > --------------------------------------------------- > > Key: CALCITE-3368 > URL: https://issues.apache.org/jira/browse/CALCITE-3368 > Project: Calcite > Issue Type: Bug > Components: core > Affects Versions: 1.21.0 > Reporter: Leonard Xu > Priority: Major > Fix For: 1.22.0 > > > I found that Calcite's simplification logic of isNull expression will convert > from > *"f(operand0, operand1) IS NULL"* to > *"operand0 IS NULL OR operand1 IS NULL"* if the Policy of RexNode‘s SqlKind > is ANY > When I do some overflow tests 。 > This simplification will not calculate the real vale of *f(operand0, > operand1)* (eg..'f0 + 'f1 )which maybe overflow. > {code:java} > //org.apache.calcite.rex.RexSimplify.java > private RexNode simplifyIsNull(RexNode a) { > // Simplify the argument first, > // call ourselves recursively to see whether we can make more progress. > // For example, given > // "(CASE WHEN FALSE THEN 1 ELSE 2) IS NULL" we first simplify the > // argument to "2", and only then we can simplify "2 IS NULL" to "FALSE". > a = simplify(a, UNKNOWN); > if (!a.getType().isNullable() && isSafeExpression(a)) { > return rexBuilder.makeLiteral(false); > } > if (RexUtil.isNull(a)) { > return rexBuilder.makeLiteral(true); > } > if (a.getKind() == SqlKind.CAST) { > return null; > } > switch (Strong.policy(a.getKind())) { > case NOT_NULL: > return rexBuilder.makeLiteral(false); > case ANY: > // "f" is a strong operator, so "f(operand0, operand1) IS NULL" simplifies > // to "operand0 IS NULL OR operand1 IS NULL" > final List<RexNode> operands = new ArrayList<>(); > for (RexNode operand : ((RexCall) a).getOperands()) { > final RexNode simplified = simplifyIsNull(operand); > if (simplified == null) { > operands.add( > rexBuilder.makeCall(SqlStdOperatorTable.IS_NULL, operand)); > } else { > operands.add(simplified); > } > } > return RexUtil.composeDisjunction(rexBuilder, operands, false); > case AS_IS: > default: > return null; > } > }{code} > And most of calculating SqlKinds are assigned *Policy.ANY* at present. > {code:java} > //org.apache.calcite.plan.Strong.java > public static Policy policy(SqlKind kind) { > return MAP.getOrDefault(kind, Policy.AS_IS); > } > .... > map.put(SqlKind.PLUS, Policy.ANY); > map.put(SqlKind.PLUS_PREFIX, Policy.ANY); > map.put(SqlKind.MINUS, Policy.ANY); > map.put(SqlKind.MINUS_PREFIX, Policy.ANY); > map.put(SqlKind.TIMES, Policy.ANY); > map.put(SqlKind.DIVIDE, Policy.ANY); > * that operator evaluates to null. */ > public enum Policy { > /** This kind of expression is never null. No need to look at its arguments, > * if it has any. */ > NOT_NULL, > /** This kind of expression has its own particular rules about whether it > * is null. */ > CUSTOM, > /** This kind of expression is null if and only if at least one of its > * arguments is null. */ > ANY, > /** This kind of expression may be null. There is no way to rewrite. */ > AS_IS, > }{code} > > It should be an obvious nonequivalent simplification. > And this issue come from Flink (FLINK-14030). > > [~danny0405], Could you have a look at this? > -- This message was sent by Atlassian Jira (v8.3.4#803005)