[ https://issues.apache.org/jira/browse/CALCITE-4725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17397709#comment-17397709 ]
Julian Hyde commented on CALCITE-4725: -------------------------------------- I see the pull request doesn't have any tests. The above description pastes large chunks of implementation code. That's not how I would approach this (following "test-driven development"). I'm not interested in the implementation. I would like to see a discussion of current and future tests. E.g. 'the validator currently regards "intColumn BETWEEN stringColumn AND booleanColumn" as valid, but I think it should give the error "xxx"'. There are several other cases. For example, we should discuss whether "dateColumn BETWEEN stringColumn AND otherDateColumn" should be valid, given that strings can be implicitly converted to dates. Likewise, discussion about conversion to relational algebra is not very interesting. If "x BETWEEN y AND z" is valid then I would expect it to be converted as if the user had written "x >= y AND x <= z". (Correct me if I'm wrong.) Bottom line: I'd expect the PR to have several new tests in SqlValidatorTest, and one or two new query execution tests in {{misc.iq}}. > Between clause operands checker should check all combinations of the three > operands > ----------------------------------------------------------------------------------- > > Key: CALCITE-4725 > URL: https://issues.apache.org/jira/browse/CALCITE-4725 > Project: Calcite > Issue Type: Bug > Components: core > Affects Versions: 1.27.0 > Reporter: Yao Zhang > Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > Between clause operands checker should check all combinations of the three > operands > Given those SQLs: > {code:sql} > create table student (id bigint, name string, score DECIMAL) > select score from student where score between '1.0' and (id < 1) > {code} > As the correct implementation of Calcite, the validation process will success > because its operands type checking logic is as discussed below: > Three operands of between clause are student.score(with type DECIMAL), '1.0' > (with type CHAR[]) and (id < 1) (an expression with the type of Boolean). > Calcite will check whether those three operands are comparable. It does two > checks: student.score with '1.0' and '1.0' with (id < 1). > We look into SqlTypeUtil.isComparable: > {code:java} > public static boolean isComparable(RelDataType type1, RelDataType type2) { > if (type1.isStruct() != type2.isStruct()) { > return false; > } > if (type1.isStruct()) { > int n = type1.getFieldCount(); > if (n != type2.getFieldCount()) { > return false; > } > for (Pair<RelDataTypeField, RelDataTypeField> pair > : Pair.zip(type1.getFieldList(), type2.getFieldList())) { > if (!isComparable(pair.left.getType(), pair.right.getType())) { > return false; > } > } > return true; > } > final RelDataTypeFamily family1 = family(type1); > final RelDataTypeFamily family2 = family(type2); > if (family1 == family2) { > return true; > } > // If one of the arguments is of type 'ANY', return true. > if (family1 == SqlTypeFamily.ANY > || family2 == SqlTypeFamily.ANY) { > return true; > } > // If one of the arguments is of type 'NULL', return true. > if (family1 == SqlTypeFamily.NULL > || family2 == SqlTypeFamily.NULL) { > return true; > } > // We can implicitly convert from character to date > if (family1 == SqlTypeFamily.CHARACTER > && canConvertStringInCompare(family2) > || family2 == SqlTypeFamily.CHARACTER > && canConvertStringInCompare(family1)) { > return true; > } > return false; > } > {code} > and canConvertStringInCompare: > {code:java} > private static boolean canConvertStringInCompare(RelDataTypeFamily > family) { > if (family instanceof SqlTypeFamily) { > SqlTypeFamily sqlTypeFamily = (SqlTypeFamily)family; > switch(sqlTypeFamily) { > case DATE: > case TIME: > case TIMESTAMP: > case INTERVAL_DAY_TIME: > case INTERVAL_YEAR_MONTH: > case NUMERIC: > case APPROXIMATE_NUMERIC: > case EXACT_NUMERIC: > case INTEGER: > case BOOLEAN: > return true; > } > } > return false; > } > {code} > We can see that DECIMAL(actually is NUMERIC) is comparable with CHAR, and > CHAR is also comparable with BOOLEAN. > But in conversion stage, it will result in an exception. I will explain below. > The type of the three operands in where clause is DECIMAL, CHAR and BOOLEAN > respectively. We cannot get a consistent type for those three operands. As a > result, Calcite will add no cast functions to those operands. So no auto type > conversion is performed. > Then, the between clause will be converted to: > {code:sql} > select score from student where score >= '1.0' and score <= (id < 1) > {code} > Obviously an integer is not comparable with a Boolean. The execution engine > will throw ex exception. This is where the problem lies. > If we change the SQL to: > {code:sql} > select score from student where score >= 1.0 and score <= (id < 1) > {code} > Calcite can correctly detect the error in validation stage that numeric type > is not comparable with Boolean type. > To correct this problem, I suggest for between operand type checker, it > should perform more strict type check process by checking all combinations of > those operands. -- This message was sent by Atlassian Jira (v8.3.4#803005)