spapin commented on code in PR #18587:
URL: https://github.com/apache/pinot/pull/18587#discussion_r3498585715
##########
pinot-core/src/main/java/org/apache/pinot/core/query/optimizer/filter/IdenticalPredicateFilterOptimizer.java:
##########
@@ -43,62 +61,45 @@ boolean canBeOptimized(Expression filterExpression,
@Nullable Schema schema) {
@Override
Expression optimizeChild(Expression filterExpression, @Nullable Schema
schema) {
Function function = filterExpression.getFunctionCall();
- FilterKind kind = FilterKind.valueOf(function.getOperator());
- switch (kind) {
- case EQUALS:
- if (hasIdenticalLhsAndRhs(function.getOperands())) {
- return TRUE;
- }
- break;
- case NOT_EQUALS:
- if (hasIdenticalLhsAndRhs(function.getOperands())) {
- return FALSE;
- }
- break;
- default:
- break;
+ if (FilterKind.valueOf(function.getOperator()) == FilterKind.EQUALS) {
+ Optional<Boolean> folded =
foldIdenticalComparisonWithTrueLiteral(function.getOperands());
+ if (folded.isPresent()) {
+ return getExpressionFromBoolean(folded.get());
+ }
}
return filterExpression;
}
/**
- * Pinot queries of the WHERE 1 != 1 AND "col1" = "col2" variety are
rewritten as
- * 1-1 != 0 AND "col1"-"col2" = 0. Therefore, we check specifically for the
case where
- * the operand is set up in this fashion.
- *
- * We return false specifically after every check to ensure we're only
continuing when
- * the input looks as expected. Otherwise, it's easy to for one of the
operand functions
- * to return null and fail the query.
- *
- * TODO: The rewrite is already happening in
PredicateComparisonRewriter.updateFunctionExpression(),
- * so we might just compare the lhs and rhs there.
+ * Folds a predicate of the form 'comparison(a, a) = true' — a comparison
whose two operands are the <em>same</em>
+ * expression — to the constant value it must always have. Returns an empty
{@link Optional} when the predicate is
+ * not such a comparison: the operands differ, the right-hand side is not
the literal {@code true}, or the function
+ * is a non-comparison like 'startsWith(a, a) = true' whose value cannot be
determined here.
*/
- private boolean hasIdenticalLhsAndRhs(List<Expression> operands) {
- boolean hasTwoChildren = operands.size() == 2;
- Expression firstChild = operands.get(0);
- if (firstChild.getFunctionCall() == null || !hasTwoChildren) {
- return false;
+ private Optional<Boolean>
foldIdenticalComparisonWithTrueLiteral(List<Expression> operands) {
Review Comment:
Good point. I think if this optimization becomes necessary, we can first ask
users to rewrite `a = a` or re-add the optimization and be specific this is for
performance, not correctness. So I'd leave as is.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]