[ https://issues.apache.org/jira/browse/CALCITE-4419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17242027#comment-17242027 ]
Julian Hyde commented on CALCITE-4419: -------------------------------------- A couple of minor things: * I'd rename 'testPosixRegex' to 'checkPosixRegex'. (Conventionally only test methods are prefixed with 'test'.) * Consider removing the {{boolean harmonize}} argument to {{PosixRegexMethodImplementor}} since it's always {{true}}. +1 when you've done those. Nice work. As a future change, consider introducing a Guava cache for compiled patterns. Compiling the pattern once per row is probably extremely slow. > Posix regex operators cannot be used within RelBuilder > ------------------------------------------------------ > > Key: CALCITE-4419 > URL: https://issues.apache.org/jira/browse/CALCITE-4419 > Project: Calcite > Issue Type: Bug > Components: core > Reporter: Ruben Q L > Assignee: Ruben Q L > Priority: Major > Labels: pull-request-available > Fix For: 1.27.0 > > Time Spent: 10m > Remaining Estimate: 0h > > Posix regex operators are special, they are defined as binary ({{public class > SqlPosixRegexOperator extends SqlBinaryOperator}}), but their actual > implementation takes three arguments, the third being a caseSensitive boolean > flag ({{SqlFunctions#posixRegex(String s, String regex, Boolean > caseSensitive)}}). > Using them in SQL works fine. The magic happens in > {{SqlPosixRegexOperator#createCall}}, which adds an extra boolean parameter > (the caseSensitive flag) into the call: > {code:java} > @Override public SqlCall createCall(SqlLiteral functionQualifier, > SqlParserPos pos, SqlNode... operands) { > pos = pos.plusAll(Arrays.asList(operands)); > operands = Arrays.copyOf(operands, operands.length + 1); > operands[operands.length - 1] = SqlLiteral.createBoolean(caseSensitive, > SqlParserPos.ZERO); > return new SqlBasicCall(this, operands, pos, false, functionQualifier); > } > {code} > However, if we try to use them in a plan created by RelBuilder, problems > occur. > 1) The following plan: > {code:java} > RelBuilder builder = ... > builder > .scan("s", "emps") > .filter( > builder.call( > SqlStdOperatorTable.POSIX_REGEX_CASE_SENSITIVE, > builder.field("name"), > builder.literal("E..."))) > ... > {code} > Fails because it cannot find the implementor method > {noformat} > Caused by: java.lang.IllegalStateException: Unable to implement > EnumerableCalc(expr#0..4=[{inputs}], expr#5=['E...'], expr#6=[POSIX REGEX > CASE SENSITIVE($t2, $t5)]... > ... > Caused by: java.lang.NoSuchMethodException: > org.apache.calcite.runtime.SqlFunctions.posixRegex(java.lang.String, > java.lang.String) > {noformat} > 2) The solution seems either adding a (somehow redundant, but necessary due > to posix regex operators design) third boolean parameter > {code:java} > .filter( > builder.call( > SqlStdOperatorTable.POSIX_REGEX_CASE_SENSITIVE, > builder.field("name"), > builder.literal("E..."), > builder.literal(true))) > {code} > or maybe add this parameter transparently inside {{RelBuilder#call}} code: > {code} > private @Nonnull RexCall call(SqlOperator operator, List<RexNode> > operandList) { > switch (operator.getKind()) { > ... > case POSIX_REGEX_CASE_SENSITIVE: > if (operandList.size() == 2) { > return (RexCall) call(operator, operandList.get(0), > operandList.get(1), literal(true)); > } > break; > case POSIX_REGEX_CASE_INSENSITIVE: > if (operandList.size() == 2) { > return (RexCall) call(operator, operandList.get(0), > operandList.get(1), literal(false)); > } > break; > ... > {code} > Either way, it will now fail validating the operands: > {noformat} > wrong operand count 3 for POSIX REGEX CASE SENSITIVE > java.lang.AssertionError: wrong operand count 3 for POSIX REGEX CASE SENSITIVE > at org.apache.calcite.util.Litmus$1.fail(Litmus.java:31) > at > org.apache.calcite.sql.SqlBinaryOperator.validRexOperands(SqlBinaryOperator.java:200) > {noformat} > 3) So, it would seem {{SqlPosixRegexOperator}} should override > {{validRexOperands}} to support 2 *and 3* parameters (which in fact would > seem correct and aligned to the current overridden behavior of > {{SqlPosixRegexOperator#getOperandCountRange}} and > {{SqlPosixRegexOperator#checkOperandTypes}}: > {code:java} > @Override public boolean validRexOperands(int count, Litmus litmus) { > if (count != 2 && count != 3) { > return litmus.fail("wrong operand count {} for {}", count, this); > } > return litmus.succeed(); > } > {code} > But now the test fails with almost the same message as the very first one: > {noformat} > Caused by: java.lang.IllegalStateException: Unable to implement > EnumerableCalc(expr#0..4=[{inputs}], expr#5=['E...'], expr#6=[true], > expr#7=[POSIX REGEX CASE SENSITIVE($t2, $t5, $t6)]... > ... > Caused by: java.lang.NoSuchMethodException: > org.apache.calcite.runtime.SqlFunctions.posixRegex(java.lang.String, > java.lang.String, boolean) > {noformat} > 4) The problem seems to be a mismatch between our third argument (boolean) > and the implementor's third argument (Boolean). Looking at the code, it seems > clear that a primitive boolean should / must be used since having a null > Boolean object makes no sense (and will cause a NPE anyway). So if we change > it: > {code:java} > // SqlFunctions.java > ... > public static boolean posixRegex(String s, String regex, boolean > caseSensitive) // instead of Boolean > { ... } > // --------- > // BuiltInMethod.java > ... > POSIX_REGEX(SqlFunctions.class, "posixRegex", String.class, String.class, > boolean.class), // instead of Boolean.class > {code} > Now our test finally succeeds. > Therefore, I propose to implement these sequence of changes in order to solve > this issue. -- This message was sent by Atlassian Jira (v8.3.4#803005)