[jira] [Created] (CALCITE-6508) Parse error when using scalar sub-query as operant to Array constructor function
Ian Bertolacci created CALCITE-6508: --- Summary: Parse error when using scalar sub-query as operant to Array constructor function Key: CALCITE-6508 URL: https://issues.apache.org/jira/browse/CALCITE-6508 Project: Calcite Issue Type: Bug Reporter: Ian Bertolacci Using a scalar subquery in the {{`ARRAY[ ... ]`}} function throws "parse failed: Query expression encountered in illegal context" (Calcite version 1.36) For example: {code:sql} select ARRAY[ (select 1) ] select ARRAY[ (select max(x) from table) ] {code} Is this something that can be configured or easily altered? -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6507) Random functions are incorrectly considered deterministic
Ruben Q L created CALCITE-6507: -- Summary: Random functions are incorrectly considered deterministic Key: CALCITE-6507 URL: https://issues.apache.org/jira/browse/CALCITE-6507 Project: Calcite Issue Type: Bug Components: core Affects Versions: 1.37.0 Reporter: Ruben Q L Assignee: Ruben Q L Fix For: 1.38.0 {{{}RAND{}}}, {{{}RANDOM{}}}, and {{RAND_INTEGER}} don't override the {{isDeterministic}} method, so they get the default behavior, i.e. true, which is incorrect. This can lead to undesired consequences, e.g.: A) {{RelMetadataQuery#getPulledUpPredicates}} can consider them as constants (and place them inside {{{}RelOptPredicateList#constantMap{}}}) since {{RelMdPredicates}} calls {{{}RexUtil#isConstant{}}}, which uses a {{ConstantFinder}} that evaluates whether a RexCall is constant or not based on the operator's isDeterministic value. B) As a consequence of A) {{SortRemoveConstantKeyRule}} can incorrectly remove an "ORDER BY RAND()" -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6506) Incorrect RelDataType generated for IN Subquery
Brandon Chong created CALCITE-6506: -- Summary: Incorrect RelDataType generated for IN Subquery Key: CALCITE-6506 URL: https://issues.apache.org/jira/browse/CALCITE-6506 Project: Calcite Issue Type: Bug Reporter: Brandon Chong If a user writes a query like: {{select * from cp."iceberg/orders/orders.parquet" where o_orderkey IN (1, 2, 3)}} It get's rewritten to use a LogicalValues with RelDataType nullable INTEGER row type. The problem is that it's actually NOT NULL. You can see the issue in SqlToRelConverter: {{final RelDataType targetRowType = SqlTypeUtil.promoteToRowType(typeFactory, validator.getValidatedNodeType(leftKeyNode), null); final boolean notIn = call.getOperator().kind == SqlKind.NOT_IN; converted = convertExists(query, RelOptUtil.SubQueryType.IN, subQuery.logic, notIn, targetRowType);}} Note that it's using the type of the leftKeyNode and not right key node. For example: A IN (B, C, D) ... it's using typeof(A) instead of typeof(B, C, D). -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6505) Redshift JSON_OBJECT incorrectly rewritten
Kenneth Stott created CALCITE-6505: -- Summary: Redshift JSON_OBJECT incorrectly rewritten Key: CALCITE-6505 URL: https://issues.apache.org/jira/browse/CALCITE-6505 Project: Calcite Issue Type: Bug Components: core Affects Versions: 1.37.0 Environment: AWS cloud Reporter: Kenneth Stott Redshift has no JSON_OBJECT function. This is currently being rewritten as JSON_OBJECT function with the KEY/VALUE syntax. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6504) JOIN_SUB_QUERY_TO_CORRELATE/Join SubQueryRemoveRule produces incorrect tree when using correlated sub-query in on clause of equi-join
Ian Bertolacci created CALCITE-6504: --- Summary: JOIN_SUB_QUERY_TO_CORRELATE/Join SubQueryRemoveRule produces incorrect tree when using correlated sub-query in on clause of equi-join Key: CALCITE-6504 URL: https://issues.apache.org/jira/browse/CALCITE-6504 Project: Calcite Issue Type: Bug Reporter: Ian Bertolacci JOIN_SUB_QUERY_TO_CORRELATE/Join SubQueryRemoveRule produces incorrect trees when using a correlated sub-query as part of an equi-join condition of a join. For example: {code:sql} select * from T3 join T2 on (select max(id) from T1 where T3.C301 = T1.id) = T2.C201 {code} Has the initial tree: {code:none} 20:Project(ID=[$0], C301=[$1], C302=[$2], ID0=[$3], C201=[$4], C202=[$5], C203=[$6], C204=[$7]) └── 19:Join(condition=[=(SCALAR_SUBQUERY{ │18:Aggregate(group=[{}], EXPR$0=[MAX($0)]) │└── 17:Project(ID=[$0]) │└── 16:Filter(condition=[=($cor0.C301, $0)]) │└── 15:TableScan(table=[T1], Schema=[ID:Dimension, C101:Decimal(0)]) │}, $4)], joinType=[inner]) ├── 13:TableScan(table=[T3], Schema=[ID:Dimension, C301:Dimension, C302:Dimension]) └── 14:TableScan(table=[T2], Schema=[ID:Dimension, C201:Dimension, C202:Decimal(0), C203:Decimal(0), C204:Decimal(2)]) {code} Using only the JOIN_SUB_QUERY_TO_CORRELATE program as a hep program, this is the resulting tree: {code} 25:Project(ID=[$0], C301=[$1], C302=[$2], ID0=[$3], C201=[$4], C202=[$5], C203=[$6], C204=[$7]) └── 41:Project(ID=[$0], C301=[$1], C302=[$2], ID0=[$3], C201=[$4], C202=[$5], C203=[$6], C204=[$7]) └── 39:Join(condition=[=($8, $4)], joinType=[inner]) ├── 13:TableScan(table=[T3], Schema=[ID:Dimension, C301:Dimension, C302:Dimension]) └── 37:Correlate(correlation=[$cor0], joinType=[left], requiredColumns=[{1}]) ├── 14:TableScan(table=[T2], Schema=[ID:Dimension, C201:Dimension, C202:Decimal(0), C203:Decimal(0), C204:Decimal(2)]) └── 35:Aggregate(group=[{}], EXPR$0=[MAX($0)]) └── 33:Project(ID=[$0]) └── 31:Filter(condition=[=($cor0.C301, $0)]) └── 15:TableScan(table=[T1], Schema=[ID:Dimension, C101:Decimal(0)]) {code} Notice that the original correlation expression is between T1 and T3, but the rule as created a correlate between T1 and T2. I would have expected this tree: {code:none} 25:Project(ID=[$0], C301=[$1], C302=[$2], ID0=[$3], C201=[$4], C202=[$5], C203=[$6], C204=[$7]) └── 41:Project(ID=[$0], C301=[$1], C302=[$2], ID0=[$3], C201=[$4], C202=[$5], C203=[$6], C204=[$7]) └── 39:Join(condition=[=($3, $4)], joinType=[inner]) ├── 37:Correlate(correlation=[$cor0], joinType=[left], requiredColumns=[{1}]) │ ├── 13:TableScan(table=[T3], Schema=[ID:Dimension, C301:Dimension, C302:Dimension]) │ └── 35:Aggregate(group=[{}], EXPR$0=[MAX($0)]) │ └── 33:Project(ID=[$0]) │ └── 31:Filter(condition=[=($cor0.C301, $0)]) │ └── 15:TableScan(table=[T1], Schema=[ID:Dimension, C101:Decimal(0)]) └── 14:TableScan(table=[T2], Schema=[ID:Dimension, C201:Dimension, C202:Decimal(0), C203:Decimal(0), C204:Decimal(2)]) {code} If you swap the sides of the join (making {{`T2 join T3`}}), you do get the correct association, but the correlate has an invalid `requiredColumn`: {code:none} 25:Project(ID=[$0], C201=[$1], C202=[$2], C203=[$3], C204=[$4], ID0=[$5], C301=[$6], C302=[$7]) └── 41:Project(ID=[$0], C201=[$1], C202=[$2], C203=[$3], C204=[$4], ID0=[$5], C301=[$6], C302=[$7]) └── 39:Join(condition=[=($8, $1)], joinType=[inner]) ├── 13:TableScan(table=[T2], Schema=[ID:Dimension, C201:Dimension, C202:Decimal(0), C203:Decimal(0), C204:Decimal(2)]) └── 37:Correlate(correlation=[$cor0], joinType=[left], requiredColumns=[{6}]) ├── 14:TableScan(table=[T3], Schema=[ID:Dimension, C301:Dimension, C302:Dimension]) └── 35:Aggregate(group=[{}], EXPR$0=[MAX($0)]) └── 33:Project(ID=[$0]) └── 31:Filter(condition=[=($cor0.C301, $0)]) └── 15:TableScan(table=[T1], Schema=[ID:Dimension, C101:Decimal(0)]) {code} Here `requiredColumn` should be 1 (pointing to `C301`) but is actually 6, which is where `C301` would be after the join {{`T2 join T3`}} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6503) Simple `NOT IN` is not pushed via JdbcAdapter
Corvin Kuebler created CALCITE-6503: --- Summary: Simple `NOT IN` is not pushed via JdbcAdapter Key: CALCITE-6503 URL: https://issues.apache.org/jira/browse/CALCITE-6503 Project: Calcite Issue Type: Bug Components: core, jdbc-adapter Affects Versions: 1.37.0 Reporter: Corvin Kuebler Hey all! Adding the following test to JdbcAdapterTests fails: {code:java} @Test void notInNotPushed() { CalciteAssert.model(JdbcTest.SCOTT_MODEL) .query("select * from dept where deptno not in (select deptno from emp)") .explainMatches("", CalciteAssert.checkResultContains("EnumerableMergeJoin", 0)) .explainMatches("", CalciteAssert.checkResultContains("EnumerableAggregate", 0)) .explainMatches("", CalciteAssert.checkResultContains("EnumerableSort", 0)); } {code} Failure: {noformat} PLAN=EnumerableCalc(expr#0..6=[{inputs}], expr#7=[0], expr#8=[=($t3, $t7)], expr#9=[IS NULL($t6)], expr#10=[>=($t4, $t3)], expr#11=[AND($t9, $t10)], expr#12=[OR($t8, $t11)], proj#0..2=[{exprs}], $condition=[$t12]) EnumerableMergeJoin(condition=[=($0, $5)], joinType=[left]) JdbcToEnumerableConverter JdbcSort(sort0=[$0], dir0=[ASC]) JdbcJoin(condition=[true], joinType=[inner]) JdbcTableScan(table=[[SCOTT, DEPT]]) JdbcAggregate(group=[{}], c=[COUNT()], ck=[COUNT($7)]) JdbcTableScan(table=[[SCOTT, EMP]]) EnumerableSort(sort0=[$0], dir0=[ASC]) EnumerableAggregate(group=[{7}], i=[LITERAL_AGG(true)]) JdbcToEnumerableConverter JdbcTableScan(table=[[SCOTT, EMP]]) should have 0 occurrence of EnumerableMergeJoin Expected :0 Actual :1 {noformat} The test shows, that instead of pushing the statement to the database, it gets rewritten and then evaulated in memory. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6502) Parser loses position information for Expression3
Mihai Budiu created CALCITE-6502: Summary: Parser loses position information for Expression3 Key: CALCITE-6502 URL: https://issues.apache.org/jira/browse/CALCITE-6502 Project: Calcite Issue Type: Bug Components: core Affects Versions: 1.37.0 Reporter: Mihai Budiu Parser.jj features the following code snipped in the Expression3 production: {code:java} if (list1.size() == 1) { // interpret as single value or query return list1.get(0); } {code} This is problematic because the span of list1 is not the same as the span of its first element. This makes it look like the expression parsed does not include the opening and closing parens. This can be a problem for error reporting tools (but there are other use cases that may be impacted). -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6501) Assertion Error in JoinUnifyRule Due to Type Mismatch
kate created CALCITE-6501: - Summary: Assertion Error in JoinUnifyRule Due to Type Mismatch Key: CALCITE-6501 URL: https://issues.apache.org/jira/browse/CALCITE-6501 Project: Calcite Issue Type: Bug Components: core Reporter: kate In some cases, there is an assertion failure during the unifyRule rewriting process due to a type mismatch. If we can't rewrite it, we'd better return null instead of assert. example: {code:java} @Test public void testJoinOnCalcToJoin3011() { String mv = "" + "select \"emps\".\"empid\", \"emps\".\"deptno\", \"depts\".\"deptno\" from\n" + "\"emps\" right join \"depts\"\n" + "on \"emps\".\"deptno\" = \"depts\".\"deptno\""; String query = "" + "select \"A\".\"empid\", \"A\".\"a\", \"A\".\"deptno\", \"depts\".\"deptno\" from\n" + " (select \"empid\", \"deptno\", \"deptno\" \"a\" from \"emps\") A" + " right join \"depts\"\n" + "on \"A\".\"deptno\" = \"depts\".\"deptno\""; sql(mv, query).noMat(); } {code} Trace: {code:java} type mismatch:type1:JavaType(int) NOT NULLtype2:JavaType(class java.lang.Integer)java.lang.AssertionError: type mismatch:type1:JavaType(int) NOT NULLtype2:JavaType(class java.lang.Integer) at org.apache.calcite.util.Litmus.lambda$static$0(Litmus.java:31) at org.apache.calcite.plan.RelOptUtil.eq(RelOptUtil.java:2204) at org.apache.calcite.rex.RexProgramBuilder$RegisterInputShuttle.visitInputRef(RexProgramBuilder.java:949) at org.apache.calcite.rex.RexProgramBuilder$RegisterInputShuttle.visitInputRef(RexProgramBuilder.java:927) at org.apache.calcite.rex.RexInputRef.accept(RexInputRef.java:125) at org.apache.calcite.rex.RexProgramBuilder.registerInput(RexProgramBuilder.java:303) at org.apache.calcite.rex.RexProgramBuilder.addProject(RexProgramBuilder.java:213) at org.apache.calcite.rex.RexProgram.create(RexProgram.java:235) at org.apache.calcite.rex.RexProgram.create(RexProgram.java:204) at org.apache.calcite.plan.SubstitutionVisitor$JoinOnLeftCalcToJoinUnifyRule.apply(SubstitutionVisitor.java:1265) at org.apache.calcite.plan.SubstitutionVisitor.go(SubstitutionVisitor.java:589) at org.apache.calcite.plan.SubstitutionVisitor.go(SubstitutionVisitor.java:523) at org.apache.calcite.test.MaterializedViewSubstitutionVisitorTest$1.optimize(MaterializedViewSubstitutionVisitorTest.java:88) at org.apache.calcite.test.MaterializedViewTester.checkNoMaterialize(MaterializedViewTester.java:94) at org.apache.calcite.test.MaterializedViewFixture.noMat(MaterializedViewFixture.java:60) at {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6500) Aggregation inside Window clause nodes fails when the query is written with a different syntax
Sree Charan Manamala created CALCITE-6500: - Summary: Aggregation inside Window clause nodes fails when the query is written with a different syntax Key: CALCITE-6500 URL: https://issues.apache.org/jira/browse/CALCITE-6500 Project: Calcite Issue Type: Bug Reporter: Sree Charan Manamala -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6499) Deep Clone SqlNodes
Ian Bertolacci created CALCITE-6499: --- Summary: Deep Clone SqlNodes Key: CALCITE-6499 URL: https://issues.apache.org/jira/browse/CALCITE-6499 Project: Calcite Issue Type: Improvement Reporter: Ian Bertolacci SqlNode.clone is not guaranteed to produce a deep clone. For example, [SqlBasicCall.clone |https://github.com/apache/calcite/blob/main/core/src/main/java/org/apache/calcite/sql/SqlBasicCall.java#L130-L132] does not clone the operands, so any in-place modifications of an node at or under the operands is observed in the "cloned" node. There use to be [SqlValidatorUtil.DeepCopier|https://calcite.apache.org/javadocAggregate/org/apache/calcite/sql/validate/SqlValidatorUtil.DeepCopier.html] but that is now deprecated (and its constructor is inaccessible anyways) -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6498) Elasticsearch nested mappings do not work
Tim Grein created CALCITE-6498: -- Summary: Elasticsearch nested mappings do not work Key: CALCITE-6498 URL: https://issues.apache.org/jira/browse/CALCITE-6498 Project: Calcite Issue Type: Bug Components: elasticsearch-adapter Reporter: Tim Grein Assignee: Tim Grein Fix For: 1.38.0 `EmbeddedElasticsearchPolicy# applyMapping` tries to apply a nested mapping by adding another `properties` field inside a parent field mapping. In Elasticsearch you need to use `fields` rather than `properties` for a nested mapping to work. I've tested this with ES 7 and ES 6 (only including ES 7 example here in the issue description; for ES 6 you need to wrap everything under "mappings" in "_doc"): ES 7: {code:java} PUT /some_index { "mappings": { "properties": { "some_field": { "type": "text", "properties": { "keyword": { "type": "keyword" } } } } } }{code} will lead to the following error: {code:java} { "error" : { "root_cause" : [ { "type" : "mapper_parsing_exception", "reason" : "Mapping definition for [some_field] has unsupported parameters: [properties : {keyword={type=keyword}}]" } ], "type" : "mapper_parsing_exception", "reason" : "Failed to parse mapping [_doc]: Mapping definition for [some_field] has unsupported parameters: [properties : {keyword={type=keyword}}]", "caused_by" : { "type" : "mapper_parsing_exception", "reason" : "Mapping definition for [some_field] has unsupported parameters: [properties : {keyword={type=keyword}}]" } }, "status" : 400 } {code} Successful request: {code:java} PUT /some_index { "mappings": { "properties": { "some_field": { "type": "text", "fields": { "keyword": { "type": "keyword" } } } } } } {code} You'll encounter this error also, if you adapt the test data in ElasticsearchAdapterTest to include nested fields: {code:java} @BeforeAll public static void setupInstance() throws Exception { // "city.keyword" is a nested field with type "keyword" final Map mapping = ImmutableMap.of("city", "text", "city.keyword", "keyword", "state", "keyword", "pop", "long"); {code} Error: {code:java} {... {"type":"mapper_parsing_exception","reason":"unknown parameter [properties] on mapper [city] of type [text]"} ...} {code} Looking at `ElasticsearchJson#visitMappingProperties` I assume this will lead to a similar issue, which I'll double-check. This is related to https://issues.apache.org/jira/browse/CALCITE-3027 as you want to detect, if you perform a `LIKE` operator on a purely `text` mapped field, which will lead to weird semantics (`text` mapped fields are analyzed/broken up into several tokens), if you do not prevent it. Usually you have a nested `keyword` mapping for a `text` field, which you can fallback to. So this is rather important to work correctly overall and in the tests. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6497) Use helper setup method throughout whole ElasticsearchAdapterTest
Tim Grein created CALCITE-6497: -- Summary: Use helper setup method throughout whole ElasticsearchAdapterTest Key: CALCITE-6497 URL: https://issues.apache.org/jira/browse/CALCITE-6497 Project: Calcite Issue Type: Improvement Components: tests Affects Versions: 1.37.0 Reporter: Tim Grein Assignee: Tim Grein Fix For: 1.38.0 There's a helper method `calciteAssert` inside the `ElasticsearchAdapterTest`, which is responsible for setting up a connection to Elasticsearch. This helper method is not used throughout the test class consistently, which leads to code duplication and some manual setup code. We should use the helper method to keep things consistent. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6496) Enable tests from outer.iq
Mihai Budiu created CALCITE-6496: Summary: Enable tests from outer.iq Key: CALCITE-6496 URL: https://issues.apache.org/jira/browse/CALCITE-6496 Project: Calcite Issue Type: Improvement Components: core Affects Versions: 1.37.0 Reporter: Mihai Budiu Many tests in outer.iq are disabled with this comment: {code} # The following test is disabled, because we cannot handle non-equi-join. # Following it are the results from MySQL. !if (false) { {code} Most of these tests could probably be enabled, but they require minor formatting. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6495) Allow ProjectSetOpTransposeRule to work with any subclass of Project
Ruben Q L created CALCITE-6495: -- Summary: Allow ProjectSetOpTransposeRule to work with any subclass of Project Key: CALCITE-6495 URL: https://issues.apache.org/jira/browse/CALCITE-6495 Project: Calcite Issue Type: Improvement Components: core Affects Versions: 1.37.0 Reporter: Ruben Q L Fix For: 1.38.0 Even if the default configuration of ProjectSetOpTransposeRule matches a LogicalProject, theoretically any downstream project could adjust the rule config to match a different subclass of Project, with the corresponding RelBuilderFactory, to obtain the rule behavior customized for their needs. However, at this point this cannot work because ProjectSetOpTransposeRule#onMatch performs a {code:java} final LogicalProject origProject = call.rel(0); {code} which leads to a {{ClassCastException}} in this scenario. Therefore, this line should be changed (and generalized) into {code:java} final Project origProject = call.rel(0); {code} (as it happens already in other rules, such as FilterSetOpTransposeRule or ProjectFilterTransposeRule) to improve the rule's adaptability, without impacting the rule's behavior. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6494) MongoAdapter throws an exception if any function unsupported by MongoProject is used
Dmitry Sysolyatin created CALCITE-6494: -- Summary: MongoAdapter throws an exception if any function unsupported by MongoProject is used Key: CALCITE-6494 URL: https://issues.apache.org/jira/browse/CALCITE-6494 Project: Calcite Issue Type: Bug Components: mongodb-adapter Affects Versions: 1.37.0 Reporter: Dmitry Sysolyatin MongoAdapter throws an exception if any function unsupported by MongoProject is used A simple example [1]: {code} @Test void testCalciteFunc() { assertModel(MODEL) .query("select CHAR_LENGTH(state) from zips") .runs(); } {code} throws {code} Suppressed: java.lang.IllegalArgumentException: Translation of CHAR_LENGTH(CAST(ITEM($0, 'state')):VARCHAR(2)) is not supported by MongoProject {code} The reason is that MongoProjectRule does not properly filter projects from LogicalProjects.getNamedProjects that can be used. I think the proper implementation of this rule should be similar to DruidProjectRule. It should extract input references from the project if it cannot be fully supported and add an additional Project on top of MongoProject. [1] - https://github.com/dssysolyatin/calcite/commit/96cddae9c6516701e1aa44f7c75df5afdc497f8d -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6493) Improve Calcite development documentation
Caican Cai created CALCITE-6493: --- Summary: Improve Calcite development documentation Key: CALCITE-6493 URL: https://issues.apache.org/jira/browse/CALCITE-6493 Project: Calcite Issue Type: Improvement Reporter: Caican Cai I noticed that in the jira submission process (like me), some terms are not standardized, such as Mysql -> MySQL, arrow -> Spark I think we can explain it in the document -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6492) Support aggregate functions which could process DISTINCT natively
Zoltan Haindrich created CALCITE-6492: - Summary: Support aggregate functions which could process DISTINCT natively Key: CALCITE-6492 URL: https://issues.apache.org/jira/browse/CALCITE-6492 Project: Calcite Issue Type: Improvement Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich This could be usefull if the execution engine natively supports some distinct aggregations natively - there is no rewrite necessary for these functions. Currently there is support [SqlAggFunction#getDistinctOptionality|https://github.com/apache/calcite/blob/0deab6f7e0cb4ec63eae8b59477d6f0fadfd11e8/core/src/main/java/org/apache/calcite/sql/SqlAggFunction.java#L187-L189] - which have overlaps with this - possibly the closest would be to set it to *IGNORED* if its supported natively...however * that's a bit misleading as its not IGNORED; but supported... * there is also [checkArgument|https://github.com/apache/calcite/blob/0deab6f7e0cb4ec63eae8b59477d6f0fadfd11e8/core/src/main/java/org/apache/calcite/rel/core/AggregateCall.java#L125] which ensures that *distinct* is not accepted in tht case. More or less the end result would be to also enhance AggregateExpandDistinctAggregatesRule with the ability to ignore aggregates. note: In Druid * if approximationCountDistinct is disabled ; that [enables a calcite rule which rewrites *all* disitnct aggregates|https://github.com/apache/druid/blob/master/sql/src/main/java/org/apache/druid/sql/calcite/planner/CalciteRulesManager.java#L496-L503] * in the meantime there are also some aggregate functions which support *distinct* natively like [string_agg](https://github.com/apache/druid/blob/c9aae9d8e683c0cc9c4687e526b8270f744c57c2/sql/src/main/java/org/apache/druid/sql/calcite/aggregation/builtin/StringSqlAggregator.java#L154) - which doesn't need any rewrites -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6491) SqlOperatorFixture's checkString doesn't check the actual value against the expected result
xiong duan created CALCITE-6491: --- Summary: SqlOperatorFixture's checkString doesn't check the actual value against the expected result Key: CALCITE-6491 URL: https://issues.apache.org/jira/browse/CALCITE-6491 Project: Calcite Issue Type: Improvement Reporter: xiong duan The Unit Test: {code:java} @Test void testOverlayFunc() { final SqlOperatorFixture f = fixture(); f.setFor(SqlStdOperatorTable.OVERLAY, VmName.EXPAND); f.checkString("overlay('ABCdef' placing 'abc' from 1)", "abcd", "VARCHAR(9) NOT NULL"); } {code} {code:java} @Test void testOverlayFunc() { final SqlOperatorFixture f = fixture(); f.setFor(SqlStdOperatorTable.OVERLAY, VmName.EXPAND); f.checkString("overlay('ABCdef' placing 'abc' from 1)", "abcdef", "VARCHAR(9) NOT NULL"); } {code} Both test cases can run successfully. I checked the code and the default implementation only verifies the result type. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6490) Missing tests for SqlFunctions#overlay
Tim Grein created CALCITE-6490: -- Summary: Missing tests for SqlFunctions#overlay Key: CALCITE-6490 URL: https://issues.apache.org/jira/browse/CALCITE-6490 Project: Calcite Issue Type: Improvement Components: tests Affects Versions: 1.37.0 Reporter: Tim Grein Assignee: Tim Grein Fix For: 1.38.0 After making some improvements to the OVERLAY operator docs in [Fix error message typo and improve docs for OVERLAY operator|[http://example.com|https://github.com/apache/calcite/pull/3877]] I've noticed that there are no tests for SqlFunctions#overlay in SqlFunctionsTest. As checkString in SqlOperatorTest doesn't check the actual value against the expected result (at least that's how I understood it) I thought it's even more important to add tests for the OVERLAY operator in SqlFunctionsTest. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6489) Add more data types to the Arrow test set
Caican Cai created CALCITE-6489: --- Summary: Add more data types to the Arrow test set Key: CALCITE-6489 URL: https://issues.apache.org/jira/browse/CALCITE-6489 Project: Calcite Issue Type: Sub-task Reporter: Caican Cai Currently, the Arrow test set only has basic data types, such as int, long, etc., but does not have smallInt, tinyint, etc. It is possible to improve the Arrow test set -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6488) Ensure collations created by RelCollations are canonized once
Ruben Q L created CALCITE-6488: -- Summary: Ensure collations created by RelCollations are canonized once Key: CALCITE-6488 URL: https://issues.apache.org/jira/browse/CALCITE-6488 Project: Calcite Issue Type: Task Components: core Affects Versions: 1.37.0 Reporter: Ruben Q L Fix For: 1.38.0 RelCollations#of canonizes internally the result before returning it; however some callers unnecessarily re-canonize the result on their end. RelCollations#shift does not canonize its result, some callers canonize on their end while others don't. It is proposed to align their behaviour: RelCollations#of and RelCollations#shift will canonize internally their results before return, so that their callers do not need to do it on their end. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6487) TRIM function documentation refers to string1 two times
Tim Grein created CALCITE-6487: -- Summary: TRIM function documentation refers to string1 two times Key: CALCITE-6487 URL: https://issues.apache.org/jira/browse/CALCITE-6487 Project: Calcite Issue Type: Bug Components: core Affects Versions: 1.37.0 Reporter: Tim Grein Assignee: Tim Grein The docs for 'TRIM' state the following: "Removes the longest string containing only the characters in _string1_ from the start/end/both ends of _string1."_ I think it should be the following to be accurate (changed last word from _string1_ to {_}string2{_}): "Removes the longest string containing only the characters in _string1_ from the start/end/both ends of _string2."_ -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6486) Make UnifyRule Implementations Public
kate created CALCITE-6486: - Summary: Make UnifyRule Implementations Public Key: CALCITE-6486 URL: https://issues.apache.org/jira/browse/CALCITE-6486 Project: Calcite Issue Type: Wish Components: core Reporter: kate More and more database engines are now using and advancing materialised views. I've noticed that calcite hasn't been updated on this in a long time. We should make the rules exposed for external use and optimisation -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6485) IN-list contains NULL throws Exception
xiong duan created CALCITE-6485: --- Summary: IN-list contains NULL throws Exception Key: CALCITE-6485 URL: https://issues.apache.org/jira/browse/CALCITE-6485 Project: Calcite Issue Type: Improvement Reporter: xiong duan {code:java} @Test void testInOperation() { sql("select 1 in (null, '2', '3') as f0 from (values (true, true, true))").ok(); } {code} {code:java} Conversion to relational algebra failed to preserve datatypes: validated type: RecordType(BOOLEAN NOT NULL F0) NOT NULL converted type: RecordType(BOOLEAN F0) NOT NULL rel: LogicalValues(tuples=[[{ null }]]) {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6484) Sonar analysis fails intermittently due to OOM
Alessandro Solimando created CALCITE-6484: - Summary: Sonar analysis fails intermittently due to OOM Key: CALCITE-6484 URL: https://issues.apache.org/jira/browse/CALCITE-6484 Project: Calcite Issue Type: Bug Components: tests Affects Versions: 1.37.0 Reporter: Alessandro Solimando Assignee: Alessandro Solimando Sometimes the Sonar analysis fails due to OOM, here an [example of CI logs|https://ci-builds.apache.org/blue/organizations/jenkins/Calcite%2FCalcite-sonar/detail/PR-3858/9/pipeline/]. In what follows the relevant extract: {code:java} The Daemon will expire after the build after running out of JVM heap space. The project memory settings are likely not configured or are configured to an insufficient value. The daemon will restart for the next build, which may increase subsequent build times. These settings can be adjusted by setting 'org.gradle.jvmargs' in 'gradle.properties'. The currently configured max heap space is '1 GiB' and the configured max metaspace is '512 MiB'. For more information on how to set these values, please refer to https://docs.gradle.org/8.7/userguide/build_environment.html#sec:configuring_jvm_memory in the Gradle documentation. To disable this warning, set 'org.gradle.daemon.performance.disable-logging=true'. > Task :sonar FAILED Build calcite FAILURE reason: Execution failed for task ':sonar': java.lang.OutOfMemoryError: Java heap space at B.A.A.A.A.F.(na:2438) at B.A.A.A.A.D$_D.(na:1146) at B.A.A.A.A.D.iterator(na:1079) at com.sonarsource.A.A.E.G$_C.D(na:2233) at com.sonarsource.A.A.C.E.A(na:972) at com.sonarsource.A.A.C.E.C(na:2276) at com.sonarsource.A.A.C.E.B(na:3115) at com.sonarsource.A.A.Y.A(na:3332) at com.sonarsource.A.A.Y.A(na:2596) at com.sonarsource.A.A.Y.E(na:1668) at com.sonarsource.A.A.Y.A(na:943) at com.sonarsource.A.A.Y.A(na:377) at com.sonarsource.A.A.Y.A(na:2750) at com.sonarsource.A.H.executeChecksOnFunction(na:1449) at com.sonarsource.A.H.executeChecks(na:2587) at com.sonarsource.A.H.executeSensor(na:3171) at com.sonarsource.A.H.execute(na:1926) at org.sonar.scanner.sensor.AbstractSensorWrapper.analyse(AbstractSensorWrapper.java:63) at org.sonar.scanner.sensor.ModuleSensorsExecutor.execute(ModuleSensorsExecutor.java:75) at org.sonar.scanner.sensor.ModuleSensorsExecutor.execute(ModuleSensorsExecutor.java:51) at org.sonar.scanner.scan.ModuleScanContainer.doAfterStart(ModuleScanContainer.java:64) at org.sonar.core.platform.ComponentContainer.startComponents(ComponentContainer.java:123) at org.sonar.core.platform.ComponentContainer.execute(ComponentContainer.java:109) at org.sonar.scanner.scan.ProjectScanContainer.scan(ProjectScanContainer.java:192) at org.sonar.scanner.scan.ProjectScanContainer.scanRecursively(ProjectScanContainer.java:188) at org.sonar.scanner.scan.ProjectScanContainer.doAfterStart(ProjectScanContainer.java:159) at org.sonar.core.platform.ComponentContainer.startComponents(ComponentContainer.java:123) at org.sonar.core.platform.ComponentContainer.execute(ComponentContainer.java:109) at org.sonar.scanner.bootstrap.ScannerContainer.doAfterStart(ScannerContainer.java:399) at org.sonar.core.platform.ComponentContainer.startComponents(ComponentContainer.java:123) at org.sonar.core.platform.ComponentContainer.execute(ComponentContainer.java:109) at org.sonar.scanner.bootstrap.GlobalContainer.doAfterStart(GlobalContainer.java:131) FAILURE: Build failed with an exception. * What went wrong: Execution failed for task ':sonar'. > Java heap space {code} At the top of the quoted text, it's recommended to increase the memory settings via the parameter 'org.gradle.jvmargs' in 'gradle.properties'. Since most of the time the analysis succeeds, we probably need a small increment at first to see if the situation improves (max heap space is now '1 GiB', max metaspace is '512 MiB'). It's detrimental to have CI failures as they not only deprive us from the sonar analysis, but they also contributed to the sentiment that flakyness in CI is OK, while we should aim at 100% green CI unless there is a real problem. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6483) Add position and length to PostgreSQL function library
Norman Jordan created CALCITE-6483: -- Summary: Add position and length to PostgreSQL function library Key: CALCITE-6483 URL: https://issues.apache.org/jira/browse/CALCITE-6483 Project: Calcite Issue Type: Improvement Components: core Reporter: Norman Jordan Assignee: Norman Jordan PostgreSQL supports the *POSITION* and *LENGTH* string functions. [https://www.postgresql.org/docs/14/functions-string.html] Also check other libraries to see if those functions should be included. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6482) OracleDialect support bool literal as predicate
kate created CALCITE-6482: - Summary: OracleDialect support bool literal as predicate Key: CALCITE-6482 URL: https://issues.apache.org/jira/browse/CALCITE-6482 Project: Calcite Issue Type: Improvement Components: core Reporter: kate -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6481) Optimize 'VALUES...UNION ALL...VALUES' to a single 'VALUES' when IN-list that the values include NULL is converted to Values
xiong duan created CALCITE-6481: --- Summary: Optimize 'VALUES...UNION ALL...VALUES' to a single 'VALUES' when IN-list that the values include NULL is converted to Values Key: CALCITE-6481 URL: https://issues.apache.org/jira/browse/CALCITE-6481 Project: Calcite Issue Type: Improvement Reporter: xiong duan The SQL: {code:java} with t1(a,y) as (select * from (values (1, 2), (3, null),(7369,null),(7499,30),(null, 20),(null, 5)) as t1) select * from t1 where (t1.a,t1.y) in ((1, 2), (3, null),(7369,null),(7499,30),(null, 20),(null, 5)); EnumerableHashJoin(condition=[AND(=($0, $2), =($1, $3))], joinType=[semi]) EnumerableValues(tuples=[[{ 1, 2 }, { 3, null }, { 7369, null }, { 7499, 30 }, { null, 20 }, { null, 5 }]]) EnumerableUnion(all=[true]) EnumerableValues(tuples=[[{ 3, null }]]) EnumerableValues(tuples=[[{ 7369, null }]]) EnumerableValues(tuples=[[{ null, 20 }]]) EnumerableValues(tuples=[[{ null, 5 }]]) EnumerableValues(tuples=[[{ 1, 2 }, { 7499, 30 }]]) !plan with t1(a,y) as (select * from (values (1, 2), (3, 5),(7369,6),(7499,30),(2, 20),(3, 5)) as t1) select * from t1 where (t1.a,t1.y) in ((1, 2), (3, 3),(7369,3),(7499,30),(1, 20),(3, 5)); EnumerableHashJoin(condition=[AND(=($0, $2), =($1, $3))], joinType=[semi]) EnumerableValues(tuples=[[{ 1, 2 }, { 3, 5 }, { 7369, 6 }, { 7499, 30 }, { 2, 20 }, { 3, 5 }]]) EnumerableValues(tuples=[[{ 1, 2 }, { 3, 3 }, { 7369, 3 }, { 7499, 30 }, { 1, 20 }, { 3, 5 }]]) !plan {code} If the IN-list includes NULL, Calcite will convert VALUES to UNION ALL. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6480) OracleDialect does not support CASE WHEN returning boolean
kate created CALCITE-6480: - Summary: OracleDialect does not support CASE WHEN returning boolean Key: CALCITE-6480 URL: https://issues.apache.org/jira/browse/CALCITE-6480 Project: Calcite Issue Type: Wish Components: core Reporter: kate Our requirement is to use Calcite to translate queries into different dialects. During validation in the TPC-DS scenario, we found that {{Oracle}} does not support SQL statements like {{}} {code:java} SELECT * FROM xxx WHERE CASE WHEN a > 10 THEN b < 5 ELSE c > 0 END;{code} {{}} Therefore, we hope to remove such predicates at the dialect like Oracle. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6479) OracleDialect does not support CASE WHEN returning boolean
kate created CALCITE-6479: - Summary: OracleDialect does not support CASE WHEN returning boolean Key: CALCITE-6479 URL: https://issues.apache.org/jira/browse/CALCITE-6479 Project: Calcite Issue Type: Wish Components: core Reporter: kate Our requirement is to use Calcite to translate queries into different dialects. During validation in the TPC-DS scenario, we found that {{Oracle}} does not support SQL statements like {{}} {code:java} SELECT * FROM xxx WHERE CASE WHEN a > 10 THEN b < 5 ELSE c > 0 END;{code} {{}} Therefore, we hope to remove such predicates at the dialect like Oracle. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6477) OracleDialect does not support CASE WHEN returning boolean
kate created CALCITE-6477: - Summary: OracleDialect does not support CASE WHEN returning boolean Key: CALCITE-6477 URL: https://issues.apache.org/jira/browse/CALCITE-6477 Project: Calcite Issue Type: Wish Components: core Reporter: kate Our requirement is to use Calcite to translate queries into different dialects. During validation in the TPC-DS scenario, we found that {{Oracle}} does not support SQL statements like {{}} {code:java} SELECT * FROM xxx WHERE CASE WHEN a > 10 THEN b < 5 ELSE c > 0 END;{code} {{}} Therefore, we hope to remove such predicates at the dialect like Oracle. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6478) JSON functions should return NULL when input is NULL
xiong duan created CALCITE-6478: --- Summary: JSON functions should return NULL when input is NULL Key: CALCITE-6478 URL: https://issues.apache.org/jira/browse/CALCITE-6478 Project: Calcite Issue Type: Improvement Reporter: xiong duan Here is the test in H2: {code:java} SELECT js, js IS JSON"json?", js IS JSON value "json?", js IS JSON SCALAR "scalar?", js IS JSON OBJECT "object?", js IS JSON ARRAY "array?" FROM (VALUES ('123'), ('"abc"'), ('{"a": "b"}'), ('[1,2]'), ('abc'), NULL) foo(js); {code} {code:java} +--+-+-+---+---+--+ |JS|json?|json?|scalar?|object?|array?| +--+-+-+---+---+--+ |123 |true |true |true |false |false | |"abc" |true |true |true |false |false | |{"a": "b"}|true |true |false |true |false | |[1,2] |true |true |false |false |true | |abc |false|false|false |false |false | |null |null |null |null |null |null | +--+-+-+---+---+--+ {code} Same SQL in Calcite: {code:java} ++---+---+-+-++ | JS | json? | json? | scalar? | object? | array? | ++---+---+-+-++ | "abc" | true | true | true| false | false | | 123| true | true | true| false | false | | [1,2] | true | true | false | false | true | | abc| false | false | false | false | false | | {"a": "b"} | true | true | false | true| false | || false | false | false | false | false | ++---+---+-+-++ {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6476) OracleDialect does not support CASE WHEN returning boolean
kate created CALCITE-6476: - Summary: OracleDialect does not support CASE WHEN returning boolean Key: CALCITE-6476 URL: https://issues.apache.org/jira/browse/CALCITE-6476 Project: Calcite Issue Type: Wish Components: core Reporter: kate Our requirement is to use Calcite to translate queries into different dialects. During validation in the TPC-DS scenario, we found that {{Oracle}} does not support SQL statements like {{}} {code:java} SELECT * FROM xxx WHERE CASE WHEN a > 10 THEN b < 5 ELSE c > 0 END;{code} {{}} Therefore, we hope to remove such predicates at the dialect like Oracle. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6475) RelToSql converter fails when IN-list that the values include NULL is converted to Values
xiong duan created CALCITE-6475: --- Summary: RelToSql converter fails when IN-list that the values include NULL is converted to Values Key: CALCITE-6475 URL: https://issues.apache.org/jira/browse/CALCITE-6475 Project: Calcite Issue Type: Bug Reporter: xiong duan The SQL {code:java} select * from "product" where ("product_id") in (12, null){code} When we convert the IN-list to OR condition, it can run successfully. But when we convert it to VALUES, it will throw NPE: {code:java} java.lang.NullPointerException at org.apache.calcite.rel.rel2sql.SqlImplementor.toSql(SqlImplementor.java:1468) at org.apache.calcite.rel.rel2sql.SqlImplementor.toSql(SqlImplementor.java:1416) at org.apache.calcite.rel.rel2sql.SqlImplementor$Context.toSql(SqlImplementor.java:702) at org.apache.calcite.rel.rel2sql.RelToSqlConverter.lambda$exprList$17(RelToSqlConverter.java:1158){code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6474) Aggregate with constant key can get a RowCount greater than its MaxRowCount
Ruben Q L created CALCITE-6474: -- Summary: Aggregate with constant key can get a RowCount greater than its MaxRowCount Key: CALCITE-6474 URL: https://issues.apache.org/jira/browse/CALCITE-6474 Project: Calcite Issue Type: Bug Components: core Affects Versions: 1.37.0 Reporter: Ruben Q L Assignee: Ruben Q L Fix For: 1.38.0 An Aggregate with constant key can get a RowCount greater than its MaxRowCount. The root cause is that this logic in RelMdMaxRowCount {code} // Aggregate with constant GROUP BY always returns 1 row if (rel.getGroupType() == Aggregate.Group.SIMPLE) { final RelOptPredicateList predicateList = mq.getPulledUpPredicates(rel.getInput()); if (!RelOptPredicateList.isEmpty(predicateList) && allGroupKeysAreConstant(rel, predicateList)) { return 1D; } } {code} is not applied in RelMdRowCount. Therefore we can get an Aggregate whose MaxRowCount is 1, but its RowCount is X (> 1). -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6473) HAVING clauses may not contain window functions
Zoltan Haindrich created CALCITE-6473: - Summary: HAVING clauses may not contain window functions Key: CALCITE-6473 URL: https://issues.apache.org/jira/browse/CALCITE-6473 Project: Calcite Issue Type: Bug Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich according to the standard: {code} The shall not contain a without an intervening . {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6472) Add degree based trig functions to PostgreSQL function library
Norman Jordan created CALCITE-6472: -- Summary: Add degree based trig functions to PostgreSQL function library Key: CALCITE-6472 URL: https://issues.apache.org/jira/browse/CALCITE-6472 Project: Calcite Issue Type: Improvement Components: core Reporter: Norman Jordan Assignee: Norman Jordan PostgreSQL supports the following trigonometric functions that are degree based. * COSD * SIND * TAND * ACOSD * ASIND * ATAND -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6471) Prevent unconditional conversion of sqlNodes to string for null-check messages in SqlToRelConverter
Konstantin Orlov created CALCITE-6471: - Summary: Prevent unconditional conversion of sqlNodes to string for null-check messages in SqlToRelConverter Key: CALCITE-6471 URL: https://issues.apache.org/jira/browse/CALCITE-6471 Project: Calcite Issue Type: Improvement Components: core Affects Versions: 1.37.0 Reporter: Konstantin Orlov Currently, {{SqlToRelConverter}} contains a few places where {{sqlNode}} is concatenated to some prefix and passed as plain string rather that message supplier as parameter to {{requireNonNull}} method. These are, namely, lines 1019, 4031, and 5344 at commit 29c413a3: {code:java} SqlNode elseOperand = requireNonNull(caseNode.getElseOperand(), "getElseOperand for " + caseNode); {code} {code:java} protected RelOptTable getTargetTable(SqlNode call) { final SqlValidatorNamespace targetNs = getNamespace(call); SqlValidatorNamespace namespace; if (targetNs.isWrapperFor(SqlValidatorImpl.DmlNamespace.class)) { namespace = targetNs.unwrap(SqlValidatorImpl.DmlNamespace.class); } else { namespace = targetNs.resolve(); } RelOptTable table = SqlValidatorUtil.getRelOptTable(namespace, catalogReader, null, null); return requireNonNull(table, "no table found for " + call); } {code} {code:java} return Pair.of(c, (e, fieldName) -> { final int j = requireNonNull(fieldMap.get(fieldName), "field " + fieldName); return rexBuilder.makeFieldAccess(e, j); }); {code} We can shave a few microseconds by simply replacing plain string with message supplier. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6470) Run specific JMH benchmarks without modifying sources
Stamatis Zampetakis created CALCITE-6470: Summary: Run specific JMH benchmarks without modifying sources Key: CALCITE-6470 URL: https://issues.apache.org/jira/browse/CALCITE-6470 Project: Calcite Issue Type: Improvement Components: build Reporter: Stamatis Zampetakis Assignee: Stamatis Zampetakis Fix For: 1.38.0 Currently we can run *all* JMH benchmarks of the project using the "jmh" Gradle task. {code:java} ./gradlew :ubenchmark:jmh {code} In many cases, we don't want to run all benchmarks but only one (or few) and this is not possible without modifying the Gradle build files (see https://github.com/melix/jmh-gradle-plugin/issues/152). The goal of this ticket is to add a (project) property (i.e., jmh.includes) to select the desired benchmarks via a regular expression that can be passed in the command line. {code:java} ./gradlew :ubenchmark:jmh -Pjmh.includes=ParserBenchmark {code} For example, the command above would run only the ParserBenchmark. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6469) Join on condition generates wrong plan when the condition is IN sub-query
xiong duan created CALCITE-6469: --- Summary: Join on condition generates wrong plan when the condition is IN sub-query Key: CALCITE-6469 URL: https://issues.apache.org/jira/browse/CALCITE-6469 Project: Calcite Issue Type: Bug Affects Versions: 1.37.0 Reporter: xiong duan Assignee: xiong duan Fix For: 1.38.0 Now Calcite will throw an exception when we execute the SQL: {code:java} select * from scott.bonus b left join scott.emp e on e.deptno in (select deptno from scott.dept b){code} I will try to rewrite this SQL become: {code:java} SELECT * FROM "SCOTT"."BONUS" LEFT JOIN ("SCOTT"."EMP" INNER JOIN (SELECT "DEPTNO" FROM "SCOTT"."DEPT" GROUP BY "DEPTNO") AS "t" ON "EMP"."DEPTNO" = "t"."DEPTNO") ON TRUE{code} The SQL : {code:java} select * from scott.emp e left join scott.bonus b on e.deptno in (select deptno from scott.dept b) {code} I will try to rewrite this SQL become: {code:java} SELECT * FROM "SCOTT"."EMP" LEFT JOIN ("SCOTT"."BONUS" CROSS JOIN (SELECT "DEPTNO" FROM "SCOTT"."DEPT" GROUP BY "DEPTNO") AS "t") ON "EMP"."DEPTNO" = "t"."DEPTNO" {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6468) RelDecorrelator throws AssertionError if correlated variable is used as Aggregate group key
Ruben Q L created CALCITE-6468: -- Summary: RelDecorrelator throws AssertionError if correlated variable is used as Aggregate group key Key: CALCITE-6468 URL: https://issues.apache.org/jira/browse/CALCITE-6468 Project: Calcite Issue Type: Bug Components: core Affects Versions: 1.37.0 Reporter: Ruben Q L Assignee: Ruben Q L Fix For: 1.38.0 The problem can be reproduced with this query (a "simplified" version of TPC-DS query1): {code:sql} WITH agg_sal AS (SELECT deptno, sum(sal) AS total FROM emp GROUP BY deptno) SELECT 1 FROM agg_sal s1 WHERE s1.total > (SELECT avg(total) FROM agg_sal s2 WHERE s1.deptno = s2.deptno) {code} If we apply subquery program, FilterAggregateTransposeRule and then we call the RelDecorrelator, it will fail with: {noformat} java.lang.AssertionError at org.apache.calcite.sql2rel.RelDecorrelator.decorrelateRel(RelDecorrelator.java:581) at org.apache.calcite.sql2rel.RelDecorrelator.decorrelateRel(RelDecorrelator.java:495) ... {noformat} The problem appears in this assert (RelDecorrelator.java:581): {code} assert newPos == newInputOutput.size(); {code} The root cause seems to be that, a few lines before, when processing the correlating variables from {{corDefOutputs}} a certain value is inserted in {{mapNewInputToProjOutputs}}: {code} if (!frame.corDefOutputs.isEmpty()) { for (Map.Entry entry : frame.corDefOutputs.entrySet()) { RexInputRef.add2(projects, entry.getValue(), newInputOutput); corDefOutputs.put(entry.getKey(), newPos); mapNewInputToProjOutputs.put(entry.getValue(), newPos); // <-- HERE newPos++; } } {code} The problem is that this value was already in the map, as it had been inserted previously as part of the group key processing: {code} for (int i = 0; i < oldGroupKeyCount; i++) { final int idx = groupKeyIndices.get(i); ... // add mapping of group keys. outputMap.put(idx, newPos); int newInputPos = requireNonNull(frame.oldToNewOutputs.get(idx)); RexInputRef.add2(projects, newInputPos, newInputOutput); mapNewInputToProjOutputs.put(newInputPos, newPos); // <-- HERE added firstly newPos++; } {code} Therefore, the unnecessary insertion into {{mapNewInputToProjOutputs}} and the subsequent increment of {{newPos}} when the {{CorDef}}s are processed leads to the mismatch. Notice how, right before the assertion, when processing the remaining fields, it is verified that the value is not already contained on the {{mapNewInputToProjOutputs}}: {code} // add the remaining fields final int newGroupKeyCount = newPos; for (int i = 0; i < newInputOutput.size(); i++) { if (!mapNewInputToProjOutputs.containsKey(i)) { // <-- HERE checked RexInputRef.add2(projects, i, newInputOutput); mapNewInputToProjOutputs.put(i, newPos); newPos++; } } {code} Thus, probably the solution would be to apply the same logic when the CorDef are processed: {code} if (!frame.corDefOutputs.isEmpty()) { for (Map.Entry entry : frame.corDefOutputs.entrySet()) { Integer pos = mapNewInputToProjOutputs.get(entry.getValue()); if (pos == null) { RexInputRef.add2(projects, entry.getValue(), newInputOutput); corDefOutputs.put(entry.getKey(), newPos); mapNewInputToProjOutputs.put(entry.getValue(), newPos); newPos++; } else { corDefOutputs.put(entry.getKey(), pos); } } } {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6467) Performance of RelMdUtil.checkInputForCollationAndLimit when using `where col in (large literal set)`
Gonzalo Ortiz created CALCITE-6467: -- Summary: Performance of RelMdUtil.checkInputForCollationAndLimit when using `where col in (large literal set)` Key: CALCITE-6467 URL: https://issues.apache.org/jira/browse/CALCITE-6467 Project: Calcite Issue Type: Improvement Affects Versions: 1.37.0, 1.36.0, 1.35.0, 1.34.0, 1.33.0, 1.32.0, 1.38.0 Reporter: Gonzalo Ortiz Attachments: MultistageEngineQuickStart_2024_07_12_111558.jfr, image-2024-07-12-15-58-33-504.png Recently we have updated Pinot to use Calcite 1.37. Previously we were using 1.31. After the upgrade, we have found issues when executing some queries that include large IN clauses. Queries like: {code:java} explain plan for SELECT DestCityName FROM ( SELECT DestCityName FROM airlineStats WHERE DestCityName IN ( 'a1', 'a2', 'a3', ... 'a300' ) GROUP BY DestCityName ) as a {code} After some debug, we have found that the issue is in one of our custom rules ( PinotSortExchangeCopyRule) when we call `RelMdUtil.checkInputForCollationAndLimit` ([link|https://github.com/apache/pinot/blob/dacc6d06907c44e83721454f1090e5f00c824f15/pinot-query-planner/src/main/java/org/apache/pinot/calcite/rel/rules/PinotSortExchangeCopyRule.java#L64]). Comparing different Pinot versions, I've found out that the change that is causing problems in our scenario is #CALCITE-5036 (see [https://github.com/apache/calcite/pull/2743)|https://github.com/apache/calcite/pull/2743/files).]. Specifically, it looks like the problem appears when `RelMdPredicates` tries to simplify the expression, which at the time is basically an `OR(DestCityName = a1, DestCityName = a2, ...)`. `RexSimplify.simplifyOrTerms` negates previous terms in order to apply possible optimizations, but in cases like this where we have hundreds of literals, that is very expensive. Going more in detail, it looks like most of the time is invested in creating new ranges: !image-2024-07-12-15-58-33-504.png! You may find more insights in the attached JFR file [^MultistageEngineQuickStart_2024_07_12_111558.jfr][^MultistageEngineQuickStart_2024_07_12_111558.jfr] I've tried to reproduce the problem with `sqline` but I wasn't able to do so. As far as I can see in the code, RelMdUtil.checkInputForCollationAndLimit is only called in SortJoinCopyRule, SortJoinTransposeRule and SortUnionTransposeRule. I've tried to create a test or JMH benchmark in Calcite to try to reproduce the issue, but I don't know codebase well enough. I don't consider myself an expert on Apache Calcite and I know we are not using Calcite in the most standard way (we are slowly migrating from our own engine to Calcite), but I'm pretty confident this issue may also affect other Calcite usages. At least in the trace I cannot see anything Pinot specific. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6466) Sql Parsers instantiation is slow
Cyril de Catheu created CALCITE-6466: Summary: Sql Parsers instantiation is slow Key: CALCITE-6466 URL: https://issues.apache.org/jira/browse/CALCITE-6466 Project: Calcite Issue Type: Improvement Reporter: Cyril de Catheu *Summary: * Sql Parsers instantiation is slow because it instantiates a Throwable with a stacktrace. The speed of instantiation also depends on the call stack depth at the time of instantiation. The deeper in the call stack, the slower the instantiation. *Analysis:* This is an issue caused by JavaCC. The issue is fixed in newer version of JavaCC. Once JavaCC is upgraded to 7.0.13 (https://issues.apache.org/jira/browse/CALCITE-5541), this benchmark can be re-run. See full discussion here: https://lists.apache.org/thread/xw35sdy1w1k8lvn1q1lr7xb93bkj0lpq *TODO*: Add a benchmark to measure the evolution of this issue. The benchmark can be re-run once https://issues.apache.org/jira/browse/CALCITE-5541 is done. Expected improvements are: - a minor speed improvement of ~10% for all parsers impacted by the issue - the instantiation of parsers should not depend on the call stack depth anymore -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6465) Rework code generator
James Duong created CALCITE-6465: Summary: Rework code generator Key: CALCITE-6465 URL: https://issues.apache.org/jira/browse/CALCITE-6465 Project: Calcite Issue Type: New Feature Components: core Reporter: James Duong Holistically replace the (or provide a separate optional) code generator to reduce issues such as CALCITE-3094 . One suggestion has been to use the code generator from Flink. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6464) Type inference for DECIMAL division seems incorrect
Mihai Budiu created CALCITE-6464: Summary: Type inference for DECIMAL division seems incorrect Key: CALCITE-6464 URL: https://issues.apache.org/jira/browse/CALCITE-6464 Project: Calcite Issue Type: Bug Components: core Affects Versions: 1.37.0 Reporter: Mihai Budiu This bug surfaces if one uses a custom type system, e.g., where DECIMAL is limited to (28, 10). The problem is in RelDataTypeSystem.deriveDecimalDivideType. The JavaDoc of this function gives the algorithm for deriving the division result type. According to these rules, if you divide two numbers of type DECIMAL(28, 10), you should get a result with type DECIMAL(28, 10). But the actual implementation infers a type of DECIMAL(28, 0), which seems incorrect. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6463) alias generation issue about correlated subquery handling
shiji.mei created CALCITE-6463: -- Summary: alias generation issue about correlated subquery handling Key: CALCITE-6463 URL: https://issues.apache.org/jira/browse/CALCITE-6463 Project: Calcite Issue Type: Improvement Components: core Affects Versions: 1.37.0 Reporter: shiji.mei Fix For: 1.38.0 original query: SELECT MAX(distinct l_orderkey) FROM lineitem where exists( SELECT MAX(c_custkey) FROM customer where c_custkey = l_orderkey GROUP BY c_custkey); Rewritten query: SELECT MAX("l_orderkey") FROM "lineitem" WHERE EXISTS (SELECT "c_custkey", MAX("c_custkey") FROM "customer" WHERE "c_custkey" = "lineitem12"."l_orderkey" GROUP BY "c_custkey") lineitem becomes lineitem12,and it will become a different number every time it runs。 thank you ! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6462) VolcanoPlanner internal valid may throw exception when log trace is enabled
xiong duan created CALCITE-6462: --- Summary: VolcanoPlanner internal valid may throw exception when log trace is enabled Key: CALCITE-6462 URL: https://issues.apache.org/jira/browse/CALCITE-6462 Project: Calcite Issue Type: Bug Components: core Affects Versions: 1.37.0 Reporter: xiong duan Unit test in JdbcAdapterTest: {code:java} @Test void testAioob_5() { CalciteAssert.model(JdbcTest.SCOTT_MODEL) .query("select *\n" + "from scott.emp e left join scott.dept d\n" + "on 'job' in (select job from scott.bonus b)") .runs(); }{code} Exceptions: {code:java} RelSubset [rel#108:RelSubset#7.ENUMERABLE.[]] has wrong best cost {1254.5 rows, 340.5 cpu, 0.0 io}. Correct cost is {1752.0 rows, 337.5 cpu, 0.0 io}{code} The best RelNode and the bestCode is different. When log trace is enabled, Tte code will be run: {code:java} if (LOGGER.isDebugEnabled()) { assert isValid(Litmus.THROW); }{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6461) Metadata provider for determining if columns originate from aggregate functions
Stamatis Zampetakis created CALCITE-6461: Summary: Metadata provider for determining if columns originate from aggregate functions Key: CALCITE-6461 URL: https://issues.apache.org/jira/browse/CALCITE-6461 Project: Calcite Issue Type: Improvement Components: core Reporter: Stamatis Zampetakis Assignee: Stamatis Zampetakis Add a new metadata provider for determining whether a set of columns originates from aggregate functions. In Apache Hive, we have a use-case where we need to determine if the columns in the result of a query originate from aggregate functions. Along with other things, it is used for deciding when to materialize common table expressions. {code:sql} SELECT job, SUM(sal) as total FROM emp GROUP BY job {code} For the query above, the new metadata provider should return: * false when the input is column 0 (job) * true when the input is column 1 (total) * false when the input is columns {0,1} The proposed metadata provider presents some similarities with {{ColumnOrigin}} and {{ExpressionLineage}} but the latter cannot be easily extended to provide this information. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6460) SortRemoveConstantKeysRule fails with AssertionError due to mismatched collation on resulting Sort
Ruben Q L created CALCITE-6460: -- Summary: SortRemoveConstantKeysRule fails with AssertionError due to mismatched collation on resulting Sort Key: CALCITE-6460 URL: https://issues.apache.org/jira/browse/CALCITE-6460 Project: Calcite Issue Type: Bug Components: core Affects Versions: 1.37.0 Reporter: Ruben Q L Assignee: Ruben Q L Fix For: 1.38.0 When {{SortRemoveConstantKeysRule}} is applied, at the end of the process it creates the resulting Sort with the new collation (which has removed the constant keys from the original one): {code} final Sort result = sort.copy(sort.getTraitSet(), input, RelCollations.of(collationsList)); {code} However, if there's a collation defined inside's the Sort's traitSet, this copy will fail, because there will be a mismatch between the (original) collation in the traitSet (which is unchanged at the moment of the copy) and the new collation that has been computed by the rule (removing constant keys): {noformat} traits=NONE.[1, 2 DESC], collation=[2 DESC] java.lang.AssertionError: traits=NONE.[1, 2 DESC], collation=[2 DESC] at org.apache.calcite.rel.core.Sort.(Sort.java:87) at org.apache.calcite.rel.logical.LogicalSort.(LogicalSort.java:48) at org.apache.calcite.rel.logical.LogicalSort.copy(LogicalSort.java:81) at org.apache.calcite.rel.core.Sort.copy(Sort.java:150) at org.apache.calcite.rel.rules.SortRemoveConstantKeysRule.onMatch(SortRemoveConstantKeysRule.java:85) ... {noformat} This problem only happens if the traitSet includes {{RelCollationTraitDef.INSTANCE}}, so it can be unnoticed in many cases. However, we can reproduce it by adjusting {{RelOptRulesTest#testSortRemovalOneKeyConstant}}: {code} sql(sql) .withVolcanoPlanner(false, p -> { p.addRelTraitDef(RelCollationTraitDef.INSTANCE); RelOptUtil.registerDefaultRules(p, false, false); }) .withRule(CoreRules.SORT_REMOVE_CONSTANT_KEYS) .check(); {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6459) Measures inside structured types (ROW, ARRAY, MAP)
Julian Hyde created CALCITE-6459: Summary: Measures inside structured types (ROW, ARRAY, MAP) Key: CALCITE-6459 URL: https://issues.apache.org/jira/browse/CALCITE-6459 Project: Calcite Issue Type: Improvement Reporter: Julian Hyde Decide whether it is valid for measures to occur inside structured types (ROW, ARRAY, MAP). If invalid, throw a validation error; if valid, change the planner to make such queries work. A measure can occur within a ROW (struct) type because measures are part of a table's row type. It is more difficult to see measures in arrays or maps, because they would have to be treated as values (similar to lambdas) and could not be expanded at compile time. Note that {{SqlTypeUtil.fromMeasure}} only handles struct types currently. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6458) RexPermuteInputsShuttle produces ArrayIndexOutOfBoundsException for join conditions with subselect
Corvin Kuebler created CALCITE-6458: --- Summary: RexPermuteInputsShuttle produces ArrayIndexOutOfBoundsException for join conditions with subselect Key: CALCITE-6458 URL: https://issues.apache.org/jira/browse/CALCITE-6458 Project: Calcite Issue Type: Bug Components: core Affects Versions: 1.37.0 Reporter: Corvin Kuebler Hey! Running the following unit test in JdbcAdapterTest fails with {code:java} @Test void testAioob() { CalciteAssert.model(JdbcTest.SCOTT_MODEL) .query("select job\n" + "from scott.emp e left join scott.dept d\n" + "on e.deptno = d.deptno and e.job not in (select distinct job from scott.bonus b)") .runs(); } {code} {code:java} java.lang.ArrayIndexOutOfBoundsException: Index 5 out of bounds for length 5 {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6457) The array_contains function return false when arrayComponentType and op1 type are differen
Caican Cai created CALCITE-6457: --- Summary: The array_contains function return false when arrayComponentType and op1 type are differen Key: CALCITE-6457 URL: https://issues.apache.org/jira/browse/CALCITE-6457 Project: Calcite Issue Type: Bug Components: core Affects Versions: 1.37.0 Reporter: Caican Cai In Spark, array_contains(array(1.0, 2), 1) returns true because Spark converts array and op1 types to the biggesttype. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6456) The Mysql log function rule LOG(x, base) is equivalent to LOG(base)/LOG(x)
Caican Cai created CALCITE-6456: --- Summary: The Mysql log function rule LOG(x, base) is equivalent to LOG(base)/LOG(x) Key: CALCITE-6456 URL: https://issues.apache.org/jira/browse/CALCITE-6456 Project: Calcite Issue Type: Bug Components: core Affects Versions: 1.37.0 Reporter: Caican Cai Fix For: 1.38.0 The fact should be that the log function in mysql should be equivalent to LOG(x, base) is equivalent to LOG(x)/LOG(base), and the calculation is not the same as bigquery -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6455) Query model table view name and model lattice name case is different
Hugh Pearse created CALCITE-6455: Summary: Query model table view name and model lattice name case is different Key: CALCITE-6455 URL: https://issues.apache.org/jira/browse/CALCITE-6455 Project: Calcite Issue Type: Bug Components: core Affects Versions: 1.36.0 Reporter: Hugh Pearse I have noticed an inconsistency for the following ==[START model]== { "version": "1.0", "defaultSchema": "foodmart", "schemas": [ { "name": "foodmart", "type": "custom", "factory": "example.Factory", "operand": { "tables": [ { "file": "product_class.json" }, { "file": "product.json" }, { "file": "sales_fact_1997.json" }, { "file": "time_by_day.json" } ] }, "lattices": [ { "name": "star", "sql": [ "select 1 from \"foodmart\".\"sales_fact_1997\" as \"s\"", "join \"foodmart\".\"product\" as \"p\" using (\"product_id\")", "join \"foodmart\".\"time_by_day\" as \"t\" using (\"time_id\")", "join \"foodmart\".\"product_class\" as \"pc\" on \"p\".\"product_class_id\" = \"pc\".\"product_class_id\"" ], "auto": false, "algorithm": true, "rowCountEstimate": 86837, "defaultMeasures": [ { "agg": "count" } ], "tiles": [ { "dimensions": [ "the_year", [ "t", "quarter" ] ], "measures": [ { "agg": "sum", "args": "unit_sales" }, { "agg": "sum", "args": "store_sales" }, { "agg": "count" } ] } ] } ], "tables": [ { "name": "v_sales_fact_1997", "type": "view", "sql": "select product_id, customer_id from foodmart.sales_fact_1997 where product_id=123", "modifiable": false } ] } ] } ==[END model]== ==[START happy path]== FrameworkConfig frameworkConfig = Frameworks.newConfigBuilder() .parserConfig( SqlParser.config() .withCaseSensitive(false) .withUnquotedCasing(Casing.UNCHANGED) .withLex(Lex.JAVA) ) .defaultSchema(connection.getRootSchema().getSubSchema(connection.getSchema())) .build(); Planner planner = Frameworks.getPlanner(frameworkConfig); // THIS WORKS // ===> String sqlInput = "select customer_id, product_name from foodmart.star"; System.out.println("\nInput SQL is: \n"+sqlInput); SqlNode sqlNodeInput = planner.parse(sqlInput); SqlNode sqlNodeInputValidated = planner.validate(sqlNodeInput); ==[END happy path]== ==[START unhappy path]== FrameworkConfig frameworkConfig = Frameworks.newConfigBuilder() .parserConfig( SqlParser.config() .withCaseSensitive(false) .withUnquotedCasing(Casing.UNCHANGED) .withLex(Lex.JAVA) ) .defaultSchema(connection.getRootSchema().getSubSchema(connection.getSchema())) .build(); Planner planner = Frameworks.getPlanner(frameworkConfig); // THIS FAILS // ===> String sqlInput = "select customer_id, product_name from foodmart.v_sales_fact_1997"; System.out.println("\nInput SQL is: \n"+sqlInput); SqlNode sqlNodeInput = planner.parse(sqlInput); SqlNode sqlNodeInputValidated = planner.validate(sqlNodeInput); ==[END unhappy path]== The error is as follows: Exception in thread "main" org.apache.calcite.tools.ValidationException: org.apache.calcite.runtime.CalciteContextException: From line 1, column 37 to line 1, column 60: Object 'FOODMART' not found; did you mean 'foodmart'? Why does parsing a query for a view get converted to uppercase, while parsing a query for a lattice does not get converted to uppercase? I believe this is a mistake as queries for both entities should be treated the same. When the model is parsed the entity names should be normalised to a common reference format, and when queries are parsed and applied against a data model, the query should be normalised to the same reference format. Somehow this is not happening, either in model parsing or query parsing. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6454) Implement array comparison operators
Norman Jordan created CALCITE-6454: -- Summary: Implement array comparison operators Key: CALCITE-6454 URL: https://issues.apache.org/jira/browse/CALCITE-6454 Project: Calcite Issue Type: Improvement Components: core Reporter: Norman Jordan The comparison operators <, <=, >, >=, =, <> are not implemented for arrays. Here is an example query: {code:java} SELECT array[2, 2] > array[1, 1]; {code} [This page|https://popsql.com/learn-sql/postgresql/how-to-compare-arrays-in-postgresql] describes how the comparisons work in PostgreSQL. Check if the comparison operators for arrays exist in other DB engines. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6453) Simplify casts which are result of constant reduction
Krisztian Kasa created CALCITE-6453: --- Summary: Simplify casts which are result of constant reduction Key: CALCITE-6453 URL: https://issues.apache.org/jira/browse/CALCITE-6453 Project: Calcite Issue Type: Improvement Reporter: Krisztian Kasa Assignee: Krisztian Kasa RexSimplify transforms nullable cast expressions with a non-nullable literal operand which type is different from the cast type to another cast expression with a literal operand with the same type as the cast. If simplify is called again using the result cast expression of the first simplification we get a literal only as a result. Example: Initial expression {code:java} CAST(_UTF-16LE'2020-10-30':VARCHAR(2147483647) CHARACTER SET "UTF-16LE"):DATE {code} where the operand is non-nullable varchar and the result of the cast is nullable date. After the first simplify call we get {code:java} CAST(2020-10-30:DATE):DATE {code} where the operand type is non-nullable date and the cast type is nullable date. https://github.com/apache/calcite/blob/8ab0b03326730aa2cc6b476b2cbd8f99799bdacb/core/src/main/java/org/apache/calcite/rex/RexSimplify.java#L2269-L2276 After the second simplify call we get {code:java} 2020-10-30:DATE {code} where the literal type is non-nullable date https://github.com/apache/calcite/blob/8ab0b03326730aa2cc6b476b2cbd8f99799bdacb/core/src/main/java/org/apache/calcite/rex/RexSimplify.java#L2195-L2196 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6452) Scalar sub-query that uses IS NOT DISTINCT FROM returns incorrect result
Julian Hyde created CALCITE-6452: Summary: Scalar sub-query that uses IS NOT DISTINCT FROM returns incorrect result Key: CALCITE-6452 URL: https://issues.apache.org/jira/browse/CALCITE-6452 Project: Calcite Issue Type: Improvement Reporter: Julian Hyde Scalar sub-query that uses an {{IS NOT DISTINCT FROM}} condition returns incorrect result. For example, {code:java} select e.ename, (select count(*) from emp as f where f.comm is not distinct from e.comm) as c from emp as e{code} returns {noformat} +++ | ENAME | C | +++ | ADAMS | 0 | | ALLEN | 1 | | BLAKE | 0 | | CLARK | 0 | | FORD | 0 | | JAMES | 0 | | JONES | 0 | | KING | 0 | | MARTIN | 1 | | MILLER | 0 | | SCOTT | 0 | | SMITH | 0 | | TURNER | 1 | | WARD | 1 | +++ (14 rows) {noformat} but should return {noformat} +++ | ENAME | C | +++ | ADAMS | 0 | | ALLEN | 10 | | BLAKE | 0 | | CLARK | 0 | | FORD | 0 | | JAMES | 0 | | JONES | 0 | | KING | 0 | | MARTIN | 10 | | MILLER | 0 | | SCOTT | 0 | | SMITH | 0 | | TURNER | 10 | | WARD | 10 | +++ (14 rows) {noformat} Also, and perhaps related, if we add a {{WHERE}} to the above query, like this: {code:java} select e.ename, (select count(*) from emp as f where f.comm is not distinct from e.comm) as c from emp as e where e.deptno = 10{code} Calcite throws: {noformat} Caused by: java.lang.ArrayIndexOutOfBoundsException: Index 6 out of bounds for length 5 > at > com.google.common.collect.RegularImmutableList.get(RegularImmutableList.java:77) > at org.apache.calcite.rex.RexBuilder.makeInputRef(RexBuilder.java:1037) > at > org.apache.calcite.sql2rel.RelDecorrelator$AdjustProjectForCountAggregateRule.lambda$onMatch2$4(RelDecorrelator.java:2679) {noformat} I ran into this error while trying to rewrite queries that had measures and used {{ROLLUP}}. (Since {{ROLLUP}} will generate group keys whose value is NULL, we cannot use regular {{=}} when doing a self-join.) -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6451) Improve Nullability Derivation for Intersect and Minus
Victor Barua created CALCITE-6451: - Summary: Improve Nullability Derivation for Intersect and Minus Key: CALCITE-6451 URL: https://issues.apache.org/jira/browse/CALCITE-6451 Project: Calcite Issue Type: Improvement Components: core Reporter: Victor Barua Assignee: Victor Barua SetOp overrides `deriveRowType()` and computes the output row type to be the least restrictive across all inputs [here|https://github.com/apache/calcite/blob/8ab0b03326730aa2cc6b476b2cbd8f99799bdacb/core/src/main/java/org/apache/calcite/rel/core/SetOp.java#L116-L127]. So for example given {code:java} Input 1: (I64, I64, I64?, I64?) Input 2: (I64, I64?, I64, I64?) {code} where ? denotes nullable, the least restrictive output computes: {code:java} Output: (I64, I64?, I64?, I64?) {code} For UNION operations, these nullabilities are accurate. However for MINUS and INTERSECT there is room for improvement. *MINUS* only returns rows from the first input, as such its output nullability should always match that of its first input: {code:java} Output: (I64, I64, I64?, I64?) {code} *INTERSECT* only returns rows that match across all inputs. If a column is not nullable in any of the inputs, then it is not nullable in the output because no rows can be emitted in which that column is null: {code:java} Output: (I64, I64, I64, I64?) {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6450) Postgres CONCAT_WS function throws exception when parameter type is (, )
xiong duan created CALCITE-6450: --- Summary: Postgres CONCAT_WS function throws exception when parameter type is (, ) Key: CALCITE-6450 URL: https://issues.apache.org/jira/browse/CALCITE-6450 Project: Calcite Issue Type: Bug Components: core Affects Versions: 1.37.0 Reporter: xiong duan The SQL can run success in Postgres: {code:java} select concat_ws(',',ARRAY[1, 1, 1, 1]); {code} But in Calcite, It will throw exception: {code:java} @Test void testConcatFunction() { final String sql = "select concat_ws(',',ARRAY[1, 1, 1, 1]) as c"; fixture() .withFactory(c -> c.withOperatorTable(t -> SqlValidatorTest.operatorTableFor(SqlLibrary.POSTGRESQL))) .withCatalogReader(MockCatalogReaderExtended::create) .withSql(sql) .ok(); }{code} {code:java} >From line 1, column 8 to line 1, column 55: Cannot apply 'CONCAT_WS' to >arguments of type 'CONCAT_WS(, )'. Supported form(s): >'CONCAT_WS()'{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6449) Enable PostgreSQL implementations of to_date/to_timestamp
Norman Jordan created CALCITE-6449: -- Summary: Enable PostgreSQL implementations of to_date/to_timestamp Key: CALCITE-6449 URL: https://issues.apache.org/jira/browse/CALCITE-6449 Project: Calcite Issue Type: Sub-task Components: core Reporter: Norman Jordan Update the PostgreSQL function library to use PostgreSQL specific implementations for *TO_DATE* and {*}TO_TIMESTAMP{*}. These implementations should fully support the date/time formatting patterns that PostgreSQL supports. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6448) FilterReduceExpressionsRule returns empty RelNode for RexLiterals
Soumyakanti Das created CALCITE-6448: Summary: FilterReduceExpressionsRule returns empty RelNode for RexLiterals Key: CALCITE-6448 URL: https://issues.apache.org/jira/browse/CALCITE-6448 Project: Calcite Issue Type: Bug Components: core Reporter: Soumyakanti Das In Hive, for simple queries with {{WHERE 'foo'}} we get an empty result, where as for simple queries with {{WHERE t.stringColumn}} we get non-empty result. Currently in {{FilterReduceExpressionsRule}}, all {{RexLiteral}}s are converted to empty values by {code:java} else if (newConditionExp instanceof RexLiteral || RexUtil.isNullLiteral(newConditionExp, true)) { call.transformTo(createEmptyRelOrEquivalent(call, filter)); {code} This might be not inline with {{RexLiteral#isAlwaysFalse}}, and it might be safer to leave the literal as it is. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6447) extract common expressions for disjunctions in Join
ruanhui created CALCITE-6447: Summary: extract common expressions for disjunctions in Join Key: CALCITE-6447 URL: https://issues.apache.org/jira/browse/CALCITE-6447 Project: Calcite Issue Type: New Feature Components: core Affects Versions: 1.37.0 Reporter: ruanhui Fix For: 1.37.0 For SQL: {code:java} select * from tbl_a join tbl_b on tbl_a.id = tbl_b.id where (tbl_a.x > 100 and tbl_b.y < 10) or (tbl_a.x > 100 and tbl_b.z > 20){code} we can rewrite it to {code:java} select * from tbl_a join tbl_b on tbl_a.id = tbl_b.id where tbl_a.x > 100 and (tbl_b.y < 10 or tbl_b.z > 20){code} And in this way *tbl_a.x > 100* can be pushed down and it is likely that this will help reduce the amount of data involved in the join. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6446) Add CONCAT_WS function (enabled in Spark library)
EveyWu created CALCITE-6446: Summary: Add CONCAT_WS function (enabled in Spark library) Key: CALCITE-6446 URL: https://issues.apache.org/jira/browse/CALCITE-6446 Project: Calcite Issue Type: Improvement Reporter: EveyWu -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6445) Add REVERSE function (enabled in Spark library)
EveyWu created CALCITE-6445: Summary: Add REVERSE function (enabled in Spark library) Key: CALCITE-6445 URL: https://issues.apache.org/jira/browse/CALCITE-6445 Project: Calcite Issue Type: Improvement Reporter: EveyWu Add Spark functions that have been implemented but have different OperandTypes/Returns. Add Function [REVERSE(str|array)|https://spark.apache.org/docs/latest/api/sql/#reverse] The parameter type supports not only the string type but also the array type. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6444) Add a RedShift SqlLibrary
Norman Jordan created CALCITE-6444: -- Summary: Add a RedShift SqlLibrary Key: CALCITE-6444 URL: https://issues.apache.org/jira/browse/CALCITE-6444 Project: Calcite Issue Type: Improvement Components: core Affects Versions: 1.34.0 Reporter: Norman Jordan Assignee: Norman Jordan The Babel tests assume that RedShift is a union of the Oracle and PostgreSQL libraries. This is not accurate. There are functions such as *INCR* that are supported by Oracle and not by RedShift. Problems will also occur when Oracle and Postgres both support a function of the same name that needs different implementations. [https://github.com/apache/calcite/blob/main/babel/src/test/java/org/apache/calcite/test/BabelQuidemTest.java#L110] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6443) Create view based on LookML model
Julian Hyde created CALCITE-6443: Summary: Create view based on LookML model Key: CALCITE-6443 URL: https://issues.apache.org/jira/browse/CALCITE-6443 Project: Calcite Issue Type: Improvement Reporter: Julian Hyde Create a view based on LookML model (more specifically, based on an explore in a LookML model). Note: I have no plans to implement this in open source, but it's helpful to set out the requirements. For example, the following statement using the LOOKML_VIEW table function {code} CREATE VIEW OrdersCube AS SELECT * FROM TABLE LOOKML_VIEW('orders', ' model: model_name { view: orders { sql: SELECT * FROM orders ;; dimension: id { primary_key: yes type: number } dimension_group: created { type: time timeframes: [time, date, week, month] sql: created_at;; } dimension: amount { type: number value_format: “0.00” } measure: count measure: total_amount { type: sum sql: amount ;; } } view: customers { sql_table_name: customers label: “Customer” dimension: id { primary_key: yes } dimension: zipcode dimension: state } explore: orders { from: orders join: customers { sql_on: customers.id = orders.customer_id ;; } } }'); {code} is equivalent to the following: {code} CREATE VIEW OrdersCube AS SELECT * FROM ( SELECT `orders.id`, // PK `orders.customer_id`, TIME(created_at) AS `orders.created_time`, DATE(created_at) AS `orders.created_date`, WEEK(created_at) AS `orders.created_week`, MONTH(created_at) AS `orders.created_month`, amount AS `orders.amount`, // value_format: “0.00” COUNT(*) AS MEASURE `orders.count`, SUM(amount) AS MEASURE `orders.total_amount` FROM orders) AS orders JOIN ( SELECT id AS `customers.id`, // PK zip_code AS `customers.zip_code`, state AS `customers.state` FROM customers) AS customers // label: “Customer” ON `customers.id` = `orders.customer_id`; {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6442) Validator rejects FILTER in OVER windows
Mihai Budiu created CALCITE-6442: Summary: Validator rejects FILTER in OVER windows Key: CALCITE-6442 URL: https://issues.apache.org/jira/browse/CALCITE-6442 Project: Calcite Issue Type: Bug Components: core Affects Versions: 1.37.0 Reporter: Mihai Budiu The Calcite grammar about windows says this: {code} windowedAggregateCall: agg '(' [ ALL | DISTINCT ] value [, value ]* ')' [ RESPECT NULLS | IGNORE NULLS ] [ WITHIN GROUP '(' ORDER BY orderItem [, orderItem ]* ')' ] [ FILTER '(' WHERE condition ')' ] OVER window {code} However, the validator rejects the following query: {code:sql} SELECT deptno, COUNT(DISTINCT deptno) FILTER (WHERE deptno > 10) OVER win AS agg FROM emp WINDOW win AS (PARTITION BY empno) {code} with the following error: {code} org.apache.calcite.sql.validate.SqlValidatorException: OVER must be applied to aggregate function at java.base/jdk.internal.reflect.DirectConstructorHandleAccessor.newInstance(DirectConstructorHandleAccessor.java:67) at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:500) at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:484) at org.apache.calcite.runtime.Resources$ExInstWithCause.ex(Resources.java:507) at org.apache.calcite.runtime.Resources$ExInst.ex(Resources.java:601) at org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:948) at org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:933) at org.apache.calcite.sql.validate.SqlValidatorImpl.newValidationError(SqlValidatorImpl.java:5676) at org.apache.calcite.sql.SqlOverOperator.validateCall(SqlOverOperator.java:77) at org.apache.calcite.sql.validate.SqlValidatorImpl.validateCall(SqlValidatorImpl.java:6404) {code} The easy fix would be to remove this from the grammar. A harder fix would be to implement support for FILTERs in windows, but I don't know how hard that would be. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6441) Add BOOLAGG_AND, BOOLAGG_OR aggregate functions (enabled in Snowflake library)
Julian Hyde created CALCITE-6441: Summary: Add BOOLAGG_AND, BOOLAGG_OR aggregate functions (enabled in Snowflake library) Key: CALCITE-6441 URL: https://issues.apache.org/jira/browse/CALCITE-6441 Project: Calcite Issue Type: Improvement Reporter: Julian Hyde Add BOOLAGG_AND, BOOLAGG_OR aggregate functions. These functions are the Snowflake equivalent to the standard EVERY and SOME aggregate functions. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6440) Add optimization to returns empty when The Order By parameter in the Sort operator is empty
Caican Cai created CALCITE-6440: --- Summary: Add optimization to returns empty when The Order By parameter in the Sort operator is empty Key: CALCITE-6440 URL: https://issues.apache.org/jira/browse/CALCITE-6440 Project: Calcite Issue Type: New Feature Components: core Affects Versions: 1.37.0 Reporter: Caican Cai Fix For: 1.38.0 For the following SQL: {code:java} select * from task order by null; {code} We could rewrite it to: {code:java} select * from task;{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6439) FILTER_INSTANCE in RelOptRule does not take effect
Caican Cai created CALCITE-6439: --- Summary: FILTER_INSTANCE in RelOptRule does not take effect Key: CALCITE-6439 URL: https://issues.apache.org/jira/browse/CALCITE-6439 Project: Calcite Issue Type: Bug Components: core Affects Versions: 1.37.0 Reporter: Caican Cai Fix For: 1.38.0, 1.36.0 test example {code:java} @Test void testFilterNull() { final String sql = "select * from emp where null"; sql(sql).withRule(PruneEmptyRules.FILTER_INSTANCE).check(); } {code} logical plan {code:java} {code} unchanged -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6438) Add optimization to returns empty when the filter operator is a false instance
Caican Cai created CALCITE-6438: --- Summary: Add optimization to returns empty when the filter operator is a false instance Key: CALCITE-6438 URL: https://issues.apache.org/jira/browse/CALCITE-6438 Project: Calcite Issue Type: New Feature Components: core Affects Versions: 1.37.0 Reporter: Caican Cai Fix For: 1.38.0 {code:java} select * from emp where false{code} The current logical plan in calcite is {code:java} LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8]) LogicalFilter(condition=[false]) LogicalTableScan(table=[[CATALOG, SALES, EMP]]) {code} It can be optimized to {code:java} LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8]) LogicalValues(tuples=[[]]) {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6437) For druid sql JSON_OBJECT() function results in RUNTIME_FAILURE when querying INFORMATION_SCHEMA.COLUMNS
AlbericByte created CALCITE-6437: Summary: For druid sql JSON_OBJECT() function results in RUNTIME_FAILURE when querying INFORMATION_SCHEMA.COLUMNS Key: CALCITE-6437 URL: https://issues.apache.org/jira/browse/CALCITE-6437 Project: Calcite Issue Type: Bug Components: core Affects Versions: 1.35.0 Reporter: AlbericByte there is but for druid sql : [https://github.com/apache/druid/issues/16356] h3. Description Running the following query {code:java} select JSON_OBJECT('name': COLUMN_NAME, 'type': DATA_TYPE) from INFORMATION_SCHEMA.COLUMNS{code} {{ }} produces {{}} {code:java} Error: RUNTIME_FAILURE (OPERATOR) cannot translate call json_object($t17, $t18, $t3, $t19, $t7) java.lang.RuntimeException {code} Note that the error message quotes the json_object call to receive five parameters, but maybe that's an unrelated internal detail. For comparison, the following works (but is useless): {code:java} select JSON_OBJECT('name': 'foo', 'type': 'VARCHAR'){code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6436) Missing round brackets in WHERE condition which contains calculations
Ulrich Kramer created CALCITE-6436: -- Summary: Missing round brackets in WHERE condition which contains calculations Key: CALCITE-6436 URL: https://issues.apache.org/jira/browse/CALCITE-6436 Project: Calcite Issue Type: Improvement Components: core Affects Versions: 1.37.0 Reporter: Ulrich Kramer Running the following unit test in {{JdbcAdapterTest}} fails with {code:java} @Test void testOperatorInWhere() { CalciteAssert.model(FoodmartSchema.FOODMART_MODEL) .query("with A as (select (\"product_id\" = 1) AS abc from \"sales_fact_1997\") " + "select * from A " + "where abc = ?") .consumesPreparedStatement(p -> p.setBoolean(1, true)) .runs(); } {code} {noformat} Caused by: java.lang.RuntimeException: While executing SQL [SELECT "product_id" = 1 AS "ABC" FROM "foodmart"."sales_fact_1997" WHERE "product_id" = 1 = ?] on JDBC sub-schema at java.base/jdk.internal.reflect.DirectConstructorHandleAccessor.newInstance(DirectConstructorHandleAccessor.java:67) {noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6435) SqlToRel conversion of IN expressions may lead to incorrect simplifications
Zoltan Haindrich created CALCITE-6435: - Summary: SqlToRel conversion of IN expressions may lead to incorrect simplifications Key: CALCITE-6435 URL: https://issues.apache.org/jira/browse/CALCITE-6435 Project: Calcite Issue Type: Bug Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich the query must have the following features: * not all columns are selected ** to enable `RelFieldTrimmer` to start a cycle * two equivalent eq filters ** one in `IN` form (`ename in ( 'Sebastian' )`) ** a regular `=` (`ename = 'Sebastian'`) * an unrelated filter like `deptno < 100` the optimizer should more-or-less start with the `RelFieldTrimmer` the issue happens like: * at parse time both literals are parsed as `CHAR(n)` * the number of values in the `IN` is below `inSubqueryThreshold` - so it gets converted to a set of `=` filters ** expression is converted to OR form ** during conversion [SqlToRelConverter#ensureSqlType|#ensureSqlType]([https://github.com/apache/calcite/blob/fb15511e76c660cbd440578421645ebe63941bf7/core/src/main/java/org/apache/calcite/sql2rel/SqlToRelConverter.java#L1777-L1779]) is called *** which skips the conversion for `CHAR` / `VARCHAR` * the `=` filter goes thru the "regular" rex conversion - which involves calling `rexBuilder#ensureType` * the filter condition contains `ename = 'Sebastian'` twice; however the types differ * `RelFieldTrimmer` start a change cycle ; which induces the simplification of the filter condition * `RexSimplify` is executed with predicate elimination disabled (this will be important) * simplification compares the two literals with [equals]([https://github.com/apache/calcite/blob/fb15511e76c660cbd440578421645ebe63941bf7/core/src/main/java/org/apache/calcite/rex/RexSimplify.java#L1685]) and returns `false` workarounds: * disable the conversion by setting `inSubqueryThreshold` to `1` * run a rule which executes `RexSimplify` with predicate elimination enabled earlier than the trimmer (ex: `ReduceExpressionsRule`) ** I think this bug remained hidden because this might happen easily testcase for `RelOptRulesTest` {code:java} @Test void testIncorrectInType() { final String sql = "select ename from emp " + " where ename in ( 'Sebastian' ) and ename = 'Sebastian' and deptno < 100"; sql(sql) .withTrim(true) .withRule() .checkUnchanged(); } {code} results in plan {code:java} LogicalProject(ENAME=[$0]) LogicalValues(tuples=[[]]) {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6434) Specify identifier quoting for HiveSqlDialect and SparkSqlDialect
xiong duan created CALCITE-6434: --- Summary: Specify identifier quoting for HiveSqlDialect and SparkSqlDialect Key: CALCITE-6434 URL: https://issues.apache.org/jira/browse/CALCITE-6434 Project: Calcite Issue Type: Improvement Components: core Affects Versions: 1.37.0 Reporter: xiong duan Assignee: xiong duan Fix For: 1.38.0 The SQL: {code:java} SELECT product.product_class_id C FROM foodmart.product LEFT JOIN (SELECT CASE COUNT(*) WHEN 0 THEN NULL WHEN 1 THEN MIN(product_class_id) ELSE (SELECT NULL UNION ALL SELECT NULL) END $f0 FROM foodmart.product) t0 ON TRUE WHERE product.net_weight > t0.$f0{code} Generate by SINGLE_VALUE agg function. This SQL will parse failed in Spark Unless we add the identifier quoting like `t0`.`$f0` -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6433) SUBSTRING can return incorrect empty result for some parameters
Iurii Gerzhedovich created CALCITE-6433: --- Summary: SUBSTRING can return incorrect empty result for some parameters Key: CALCITE-6433 URL: https://issues.apache.org/jira/browse/CALCITE-6433 Project: Calcite Issue Type: Improvement Components: core Affects Versions: 1.37.0 Reporter: Iurii Gerzhedovich SUBSTRING function for cases when 3rd parameter (length) more than Integer.MAX_VALUE can return empty result due to code do clamp that value and after that it can't be more than Integer.MAX_VALUE. Simple way to reproduce : append into *SqlOperatorTest* smth like: {noformat} f.checkScalar( String.format("{fn SUBSTRING('abcdef', %d, %d)}", Integer.MIN_VALUE, 10L + Integer.MAX_VALUE), "abcdef", "VARCHAR(6) NOT NULL"); {noformat} it`s all due to check after clamping {noformat} public static String substring(String c, int s, int l) { long e = (long) s + (long) l; -- here we can got incorrect length . if (s > lc || e < 1L) { return ""; } -{noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6432) Infinite loop for JoinPushTransitivePredicatesRule when there are multiple project expressions reference the same input field
asdfgh19 created CALCITE-6432: - Summary: Infinite loop for JoinPushTransitivePredicatesRule when there are multiple project expressions reference the same input field Key: CALCITE-6432 URL: https://issues.apache.org/jira/browse/CALCITE-6432 Project: Calcite Issue Type: Improvement Components: core Reporter: asdfgh19 Assignee: asdfgh19 Attachments: 1.png, 2.png Below is a newly created test case that is stuck in an infinite loop. {code:java} @Test void testProjectPredicatePull() { final String sql = "select e.ename, d.dname\n" + "from (select ename, deptno from emp where deptno = 10) e\n" + "join (select name dname, deptno, * from dept) d\n" + "on e.deptno = d.deptno"; final HepProgram program = new HepProgramBuilder() .addRuleCollection( ImmutableList.of(CoreRules.FILTER_PROJECT_TRANSPOSE, CoreRules.JOIN_PUSH_TRANSITIVE_PREDICATES)) .build(); sql(sql).withProgram(program).check(); } {code} !1.png|width=563,height=192! >From the figure above, we can see the process of this infinite loop: # JOIN_PUSH_TRANSITIVE_PREDICATES infers DEPTNO = 10 from the left side of the Join # JOIN_PUSH_TRANSITIVE_PREDICATES creates a new Filter DEPTNO = 10 on the right side of the Join # FILTER_PROJECT_TRANSPOSE pushes this newly created Filter past the Project # JOIN_PUSH_TRANSITIVE_PREDICATES infers DEPTNO = 10 from the left side of the Join and infers {color:#FF}DEPTNO0 = 10{color} from the right side of the Join,Note that it is DEPTNO0, not DEPTNO # So, JOIN_PUSH_TRANSITIVE_PREDICATES creates a new Filter DEPTNO = 10 and put it to the right side of the Join again. And then the process keeps repeating. The reason is that when RelMdPredicates infers predicates for a project, and the project has multiple expressions referencing the same input field, for example, DEPT=$0 DEPT0=$0, and the input predicate is $0=10, the inferred result is DEPT0=10, which means that the predicates related to other expressions with the same input field (DEPT=10) are lost, leaving only the last one. Below is the test case from RelOptRulesTest and its planAfter. {code:java} @Test void testJoinPushTransitivePredicatesRule2() { final String sql = "select n1.SAL\n" + "from EMPNULLABLES_20 n1\n" + "where n1.SAL IN (\n" + " select n2.SAL\n" + " from EMPNULLABLES_20 n2\n" + " where n1.SAL = n2.SAL or n1.SAL = 4)"; sql(sql).withDecorrelate(true) .withExpand(true) .withRule(CoreRules.FILTER_INTO_JOIN, CoreRules.JOIN_CONDITION_PUSH, CoreRules.JOIN_PUSH_TRANSITIVE_PREDICATES) .check(); }{code} !2.png|width=563,height=175! In the case where the left side of the Join already has an identical predicate (>($5, 1000)), JOIN_PUSH_TRANSITIVE_PREDICATES infers a new predicate (LogicalFilter(condition=[>($5, 1000)])) from the right side of the Join and puts it on the left side of the Join. This is because the Project on the left side of the Join has SAL=[$5], SAL0=[$5], and JOIN_PUSH_TRANSITIVE_PREDICATES fails to find this predicate, so it creates a new one. If we add a new rule CoreRules.FILTER_PROJECT_TRANSPOSE to this test case and modify it slightly, it will also fall into an infinite loop. {code:java} @Test void testJoinPushTransitivePredicatesRule2() { final String sql = "select n1.SAL\n" + "from EMPNULLABLES_20 n1\n" + "where n1.SAL IN (\n" + " select n2.SAL\n" + " from EMPNULLABLES_20 n2\n" + " where n1.SAL = n2.SAL or n1.SAL = 4)"; final HepProgram program = new HepProgramBuilder() .addRuleCollection( ImmutableList.of(CoreRules.FILTER_INTO_JOIN, CoreRules.FILTER_PROJECT_TRANSPOSE, CoreRules.JOIN_CONDITION_PUSH, CoreRules.JOIN_PUSH_TRANSITIVE_PREDICATES)) .build(); sql(sql).withDecorrelate(true) .withExpand(true) .withProgram(program) .check(); } {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6431) Implement the SINGLE_VALUE aggregation in HiveSqlDialect And SparkSQLDialect
xiong duan created CALCITE-6431: --- Summary: Implement the SINGLE_VALUE aggregation in HiveSqlDialect And SparkSQLDialect Key: CALCITE-6431 URL: https://issues.apache.org/jira/browse/CALCITE-6431 Project: Calcite Issue Type: Improvement Components: core Affects Versions: 1.37.0 Reporter: xiong duan Assignee: xiong duan Fix For: 1.38.0 When we use the SQL including the scalar query, converting this SQL Rel to HiveSQL will include the SINGLE_VALUE aggregation function(HiveSQL can't handle it) in the dialect SQL. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6430) SINGLE_VALUE rewrite to wrong sql when the sub-query return one not-null value and NULL value in PostgreSQL、MySQL、HSQL dialect
xiong duan created CALCITE-6430: --- Summary: SINGLE_VALUE rewrite to wrong sql when the sub-query return one not-null value and NULL value in PostgreSQL、MySQL、HSQL dialect Key: CALCITE-6430 URL: https://issues.apache.org/jira/browse/CALCITE-6430 Project: Calcite Issue Type: Improvement Components: core Affects Versions: 1.37.0 Reporter: xiong duan Assignee: xiong duan Fix For: 1.38.0 Calcite rewrite the SINGLE_VALUE to different sql : For example in HSQL: {code:java} CASE COUNT(*) WHEN 0 THEN NULL WHEN 1 THEN MIN() ELSE (VALUES 1 UNION ALL VALUES 1) END{code} This is right. But Calcite will generate: {code:java} CASE COUNT(result) WHEN 0 THEN NULL WHEN 1 THEN MIN() ELSE (VALUES 1 UNION ALL VALUES 1) END{code} This sql will return wrong result. For Example: tableA: ||c1||c2|| |4|1| |NULL|NULL| |NULL|NULL| |NULL|NULL| TheSQL: {code:java} select * from tableA where c1 > (select c2 from tableA);{code} will throw : [21000][1242] Subquery returns more than 1 row But SQL: {code:java} select c1 as column1C1 from column3 left join (select case count(c2) when 0 then null when 1 then min(c2) else (select cast(null as integer) union all select cast(null as integer)) end as alias from column3 ) as t1 on true where column3.c1 > t1.alias;{code} will return one row value. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6429) Arrow adapter should default to the Enumerable convention for unsupported features
Alessandro Solimando created CALCITE-6429: - Summary: Arrow adapter should default to the Enumerable convention for unsupported features Key: CALCITE-6429 URL: https://issues.apache.org/jira/browse/CALCITE-6429 Project: Calcite Issue Type: Sub-task Affects Versions: 1.37.0 Reporter: Alessandro Solimando Assignee: Alessandro Solimando Fix For: 1.38.0 Currently, the adapter fails with different errors when one of the known unsupported features is used, while it might simply catch the "UnsupportedOperationException" and bail out, thus allowing to generate a plan where the unsupported operation is implemented in the Enumerable convention. For instance, currently several tests are disabled for this reason, like [testArrowProjectFieldsWithDisjunctiveFilter|https://github.com/apache/calcite/blob/be9b860ebf3143dcbdbd0ee1777e604f0ace7b3c/arrow/src/test/java/org/apache/calcite/adapter/arrow/ArrowAdapterTest.java#L242], because it would fail as follows: {code:java} Error while executing SQL "select "intField", "stringField" from arrowdata where "intField"=12 or "stringField"='12'": Error while applying rule ArrowFilterRule, args [rel#31:LogicalFilter.NONE.[](input=RelSubset#15,condition=OR(=($0, 12), =($1, '12'))), rel#1:ArrowTableScan.ARROW.[](table=[ARROW, ARROWDATA],fields=[0, 1, 2, 3])] java.sql.SQLException: Error while executing SQL "select "intField", "stringField" from arrowdata where "intField"=12 or "stringField"='12'": Error while applying rule ArrowFilterRule, args [rel#31:LogicalFilter.NONE.[](input=RelSubset#15,condition=OR(=($0, 12), =($1, '12'))), rel#1:ArrowTableScan.ARROW.[](table=[ARROW, ARROWDATA],fields=[0, 1, 2, 3])] at org.apache.calcite.avatica.Helper.createException(Helper.java:56) at org.apache.calcite.avatica.Helper.createException(Helper.java:41) at org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:164) at org.apache.calcite.avatica.AvaticaStatement.executeQuery(AvaticaStatement.java:228) at org.apache.calcite.test.CalciteAssert.assertQuery(CalciteAssert.java:566) at org.apache.calcite.test.CalciteAssert$AssertQuery.lambda$returns$1(CalciteAssert.java:1495) at org.apache.calcite.test.CalciteAssert$AssertQuery.withConnection(CalciteAssert.java:1434) at org.apache.calcite.test.CalciteAssert$AssertQuery.returns(CalciteAssert.java:1493) at org.apache.calcite.test.CalciteAssert$AssertQuery.returns(CalciteAssert.java:1483) at org.apache.calcite.test.CalciteAssert$AssertQuery.returns(CalciteAssert.java:1446) at org.apache.calcite.adapter.arrow.ArrowAdapterTest.testArrowProjectFieldsWithDisjunctiveFilter(ArrowAdapterTest.java:260) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:727) at org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60) at org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131) at org.junit.jupiter.engine.extension.SameThreadTimeoutInvocation.proceed(SameThreadTimeoutInvocation.java:45) at org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:156) at org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:147) at org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestMethod(TimeoutExtension.java:86) at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(InterceptingExecutableInvoker.java:103) at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.lambda$invoke$0(InterceptingExecutableInvoker.java:93) at org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106) at org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64) at org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45) at org.junit.jupiter.engine.execution.InvocationInterceptorChain.invoke(InvocationInterceptorChain.java:37) at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:92) at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(Inte
[jira] [Created] (CALCITE-6428) Typo in adapter documentation
Thad Guidry created CALCITE-6428: Summary: Typo in adapter documentation Key: CALCITE-6428 URL: https://issues.apache.org/jira/browse/CALCITE-6428 Project: Calcite Issue Type: Improvement Reporter: Thad Guidry There's a typo `10:37` which should be `9:37` in the [Window functions|https://calcite.apache.org/docs/adapter.html#window-functions] section of `/docs/adapter.html` -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6427) Use a higher precision for DECIMAL intermediate results for some aggregate functions like STDDEV
Mihai Budiu created CALCITE-6427: Summary: Use a higher precision for DECIMAL intermediate results for some aggregate functions like STDDEV Key: CALCITE-6427 URL: https://issues.apache.org/jira/browse/CALCITE-6427 Project: Calcite Issue Type: Bug Components: core Affects Versions: 1.37.0 Reporter: Mihai Budiu Today aggregates like STDDEV are expanded into computations that use intermediate results with the same type as the input values. Since STDDEV squares values, this precision may prove insufficient. For DECIMAL values a reasonable approach would be to use double the input precision and scale. This is also related to [CALCITE-6322], [CALCITE-4924], [CALCITE-6324] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6426) When performing starrocks dialect conversion for cast (xxx as int) and cast (xxx as bigint), it defaults to using MySQL dialect
fanluo created CALCITE-6426: --- Summary: When performing starrocks dialect conversion for cast (xxx as int) and cast (xxx as bigint), it defaults to using MySQL dialect Key: CALCITE-6426 URL: https://issues.apache.org/jira/browse/CALCITE-6426 Project: Calcite Issue Type: Bug Affects Versions: 1.37.0 Reporter: fanluo Attachments: image-2024-06-03-19-55-59-034.png, image-2024-06-03-19-57-22-686.png When performing starrocks dialect conversion for cast (xxx as int) and cast (xxx as bigint), it defaults to using MySQL dialect,like this: !image-2024-06-03-19-57-22-686.png|width=839,height=123! while starlocks does not explicitly declare support for the signed data type. If conversion is performed according to the MySQL dialect by default, the execution result will exceed expectations when the data value exceeds the range [-21474836482147483647] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6425) Attributes
Julian Hyde created CALCITE-6425: Summary: Attributes Key: CALCITE-6425 URL: https://issues.apache.org/jira/browse/CALCITE-6425 Project: Calcite Issue Type: Improvement Reporter: Julian Hyde Add operators and metadata so that columns can have attributes. For example, we wish to associate a FORMATTED_VALUE attribute to the SALARY column. The following example defines an attribute in a CTE and then uses it in a query. {code:java} CREATE VIEW EnhancedEmp AS SELECT deptno, ename, DEFINE_ATTRIBUTE( DEFINE_ATTRIBUTE(revenue, 'FORMATTED_VALUE', CAST(revenue AS TO_CHAR(revenue, 'L99D99')), 'SORT_KEY', revenue) FROM Emp); SELECT ename, sal, GET_ATTRIBUTE(sal, 'FORMATTED_VALUE') AS formatted_sal, DESCRIBE_ATTRIBUTES(sal) AS attributes FROM EnhancedEmp WHERE ename = 'SCOTT'; ENAME SAL FORMATTED_SAL ATTRIBUTES = = == SCOTT 1200 $1200.00. FORMATTED_VALUE: VARCHAR, SORT_KEY: NUMBER SELECT deptno, SUM(sal) AS sum_sal, GET_ATTRIBUTE(SUM(sal), 'FORMATTED_VALUE') AS formatted_sum_sal FROM EnhancedEmp WHERE ename = 'SCOTT'; DEPTNO SUM_SAL FORMATTED_SUM_SAL == === = 105000 $5,000.00 203750 $3,750.00{code} Here are the functions: * {{DEFINE_ATTRIBUTE(targetExpression, attributeName, expression)}} defines an attribute on targetExpression; the return value is the same but has an extra attribute * {{GET_ATTRIBUTE(targetExpression, attributeName)}} evaluates an attribute of an expression * {{DESCRIBE_ATTRIBUTES(targetExpression)}} returns a string describing the names and types of available attributes What is the goal of this facility? To be able to attach "semantic" metadata, such as format string, formatted value, sort key, to values in a way that can easily be transmitted over JDBC. An alternative approach would have been to convert values into records. But the consuming expression and JDBC would have to deal with those records. In the proposed approach, you can find what attributes are available as part of an 'extended type' using {{DESCRIBE_ATTRIBUTES}}. Also required is a set of rules for how particular attributes propagate through queries and expressions. For example, the {{FORMATTED_VALUE}} attribute propagates through subquery ({{SELECT}}), filter ({{WHERE}}), aggregate ({{GROUP BY}}), and through the {{SUM}} aggregate function. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6424) Enable RLIKE function in MySQL library
xiong duan created CALCITE-6424: --- Summary: Enable RLIKE function in MySQL library Key: CALCITE-6424 URL: https://issues.apache.org/jira/browse/CALCITE-6424 Project: Calcite Issue Type: Improvement Components: core Affects Versions: 1.37.0 Reporter: xiong duan Assignee: xiong duan Fix For: 1.38.0 According to [https://sqlfiddle.com/mysql/online-compiler?id=1bca82cf-ebb9-4100-8cec-302b688f7ba8]. And the Official Document about RLIKE https://dev.mysql.com/doc/refman/8.4/en/regexp.html#operator_regexp -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6423) Invalid unparse for CHAR without precision in MySQLDialect
xiong duan created CALCITE-6423: --- Summary: Invalid unparse for CHAR without precision in MySQLDialect Key: CALCITE-6423 URL: https://issues.apache.org/jira/browse/CALCITE-6423 Project: Calcite Issue Type: Improvement Affects Versions: 1.37.0 Reporter: xiong duan Assignee: xiong duan Fix For: 1.38.0 As code review comment in [https://github.com/apache/calcite/pull/3799.] Execute SQL : {code:java} select cast(product_id as char) from product{code} Exppected MySQL SQL should be: {code:java} select cast(product_id as char) from product{code} But is: {code:java} select cast(product_id as char(1)) from product{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6422) RexLiteral.isNullLiteral should be called before RexLiteral.booleanValue in SubstitutionVisitor.mayBeSatisfiable
Mou Wu created CALCITE-6422: --- Summary: RexLiteral.isNullLiteral should be called before RexLiteral.booleanValue in SubstitutionVisitor.mayBeSatisfiable Key: CALCITE-6422 URL: https://issues.apache.org/jira/browse/CALCITE-6422 Project: Calcite Issue Type: Bug Reporter: Mou Wu -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6421) Calcite Avatica support JDK 22
Sergey Nuyanzin created CALCITE-6421: Summary: Calcite Avatica support JDK 22 Key: CALCITE-6421 URL: https://issues.apache.org/jira/browse/CALCITE-6421 Project: Calcite Issue Type: Bug Components: avatica Reporter: Sergey Nuyanzin Assignee: Sergey Nuyanzin -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6420) Fix confusing MappingType enum
ZiJie Song created CALCITE-6420: --- Summary: Fix confusing MappingType enum Key: CALCITE-6420 URL: https://issues.apache.org/jira/browse/CALCITE-6420 Project: Calcite Issue Type: Improvement Components: core Affects Versions: 1.37.0 Reporter: ZiJie Song Assignee: ZiJie Song Fix confusing MappingType enum I've noticed that the MappingType enumeration is quite confusing and contradicts its own definition. After some research, I found that actually, there are many bugs in the code and implementation of the entire Mapping part. Next I'll describe the error I observed. 1. Enumeration error According to the four fields OPTIONAL_SOURCE, MULTIPLE_SOURCE, OPTIONAL_TARGET, MULTIPLE_TARGET in MappingType.java and the function design in the class, I realized that the original design intention of this class is as follows: Mapping is divided into 16 types according to the corresponding relationship between source and target. For a given source, it can correspond to 1, <=1, >=1 or any target, so the mapping is divided into four categories. On the contrary, the target corresponding to the source is also divided into four categories, for a total of 16 types of mapping. But for some reason, the ordinal of MappingType is not set correctly. According to the above definition, the ordinal of the surjection should be 2, but it is set to 1. The injective ordinal should be 1, but is set to 2. Many other types also have errors, such as InverseSurjection and so on. The above error led to a funny phenomenon, the result of MappingType.Surjection.isSurjection() turned out to be false. This also proves the correctness of my observation 1. 2. Implementation errors Due to the wrong MappingType setting, the implementation in Mapping is also very confusing. Partial_Surjection and Surjection are implemented as PartialMapping, which is completely inconsistent with their literal meaning. InverseSurjection should be the inverse function of Surjection, but it is interpreted as SurjectionWithInverse. This is not just an error in the meaning of variable naming, because functions such as MappingType.inverse() are also called in the code, which will lead to actual logical errors. TODO: 1. Set the ordinal correctly according to the literal meaning of each enumeration of MappingType. 2. Correspond the implementation of Mapping to the correct MappingType 3. Check the places where Mapping is used in the entire project and set them to the correct MappingType (about a dozen places). This requires examining the logical meaning they really need (for example, when using Surjection, does the code want to use the correct Surjection or the original wrong one? ). After modify the logic, I need to check whether it will cause other changes. I'd love to fix these issues myself. I'd be very happy if someone could help me check if the above ideas are correct. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6419) Invalid unparse for VARCHAR without precision in HiveSqlDialect And SparkSqlDialect
xiong duan created CALCITE-6419: --- Summary: Invalid unparse for VARCHAR without precision in HiveSqlDialect And SparkSqlDialect Key: CALCITE-6419 URL: https://issues.apache.org/jira/browse/CALCITE-6419 Project: Calcite Issue Type: Improvement Components: core Affects Versions: 1.37.0 Reporter: xiong duan Assignee: xiong duan Fix For: 1.38.0 When we execute SQL in Calcite: {code:java} select cast(product_id as varchar) from product; {code} Generage the HiveSQL\SparkSQL: {code:java} select cast(product_id as varchar) from product; {code} According to the [https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-VarcharvarcharVarchar]. In Hive, the varchar must have a precision.So when unpare VARCHAR without precision, I will convert VARCHAR to String. VARCHAR with precison do nothing. According to the [https://spark.apache.org/docs/latest/sql-ref-datatypes.html]. In Spark, Same as Hive. But as note, It can only be used in table schema, not functions/operators. So I will convert VARCHAR with or without precision to String; -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6418) Expression with ALL aggregate return unexpected result.
Evgeny Stanilovsky created CALCITE-6418: --- Summary: Expression with ALL aggregate return unexpected result. Key: CALCITE-6418 URL: https://issues.apache.org/jira/browse/CALCITE-6418 Project: Calcite Issue Type: Bug Components: core Affects Versions: 1.37.0 Reporter: Evgeny Stanilovsky I expect that: {noformat} SELECT i=ALL(SELECT i FROM (values(1), (null)) integers(i) WHERE i=i1.i OR i IS NULL) FROM (values(1), (2)) i1(i) ORDER BY i; {noformat} will return: {noformat} null null {noformat} or {noformat} false false {noformat} (standard defines only true|false) as a return result for EVERY aggregate instead: {noformat} true null {noformat} was observed. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6417) Map value constructor and Array value constructor unparsed incorrectly for HiveSqlDialect
xiong duan created CALCITE-6417: --- Summary: Map value constructor and Array value constructor unparsed incorrectly for HiveSqlDialect Key: CALCITE-6417 URL: https://issues.apache.org/jira/browse/CALCITE-6417 Project: Calcite Issue Type: Improvement Components: core Affects Versions: 1.37.0 Reporter: xiong duan Assignee: xiong duan Fix For: 1.38.0 The SQL: {code:java} SELECT MAP['k1', 'v1', 'k2', 'v2'],ARRAY[1, 2, 3]{code} Generate Hive SQL should be: {code:java} SELECT MAP ('k1', 'v1', 'k2', 'v2'),ARRAY (1, 2, 3){code} But is: {code:java} SELECT MAP['k1', 'v1', 'k2', 'v2'],ARRAY[1, 2, 3]{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6416) Remove unnecessary SUBSTRING rewrite in SparkSqlDialect
xiong duan created CALCITE-6416: --- Summary: Remove unnecessary SUBSTRING rewrite in SparkSqlDialect Key: CALCITE-6416 URL: https://issues.apache.org/jira/browse/CALCITE-6416 Project: Calcite Issue Type: Improvement Components: core Affects Versions: 1.37.0 Reporter: xiong duan Assignee: xiong duan Fix For: 1.38.0 SparkSqlDialect have a unnecessary rewrite about SUBSTRING func. In CALCITE-3247, We handle the SUBSTRING rewrite in HiveSqlDialect. In CALCITE-3072, We handle the SUBSTRING rewrite in SparkSqlDialect. In CALCITE-5677, We refactor the SUBSTRING as the default behaviour and remove the SUBSTRING rewrite in HiveSqlDialect. This PR will remove the Spark. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6415) Invalid unparse for TIMESTAMP with HiveSqlDialect
xiong duan created CALCITE-6415: --- Summary: Invalid unparse for TIMESTAMP with HiveSqlDialect Key: CALCITE-6415 URL: https://issues.apache.org/jira/browse/CALCITE-6415 Project: Calcite Issue Type: Improvement Components: core Affects Versions: 1.37.0 Reporter: xiong duan Assignee: xiong duan Fix For: 1.38.0 When parsing {code:java} SELECT CAST("2023-11-10" AS TIMESTAMP) {code} The unparsed Hive SQL query gives: {code:java} SELECT CAST("2023-11-10" AS TIMESTAMP(0)) {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6414) Resolve Snoflake SQL generation of BOOL_OR, BOOL_AND
Justin Swett created CALCITE-6414: - Summary: Resolve Snoflake SQL generation of BOOL_OR, BOOL_AND Key: CALCITE-6414 URL: https://issues.apache.org/jira/browse/CALCITE-6414 Project: Calcite Issue Type: Bug Components: core Reporter: Justin Swett The rewriteMacMInExpr is generating incorrect SQL for Snowflake. Instead of BOOL_OR, it should be BOOLOR_AGG and instead BOOL_AND, it should be BOOLAND_AGG I think adding the following expectation in testMaxMinOnBooleanColumn should repro: ``` @Test void testMaxMinOnBooleanColumn() { ... final String expectedSnowflake = "SELECT BOOLOR_AGG(\"brand_name\" = 'a'), " + "BOOLAND_AGG(\"brand_name\" = 'a'), " + "MIN(\"brand_name\")\n" + "FROM \"foodmart\".\"product\""; sql(query) .ok(expected) .withSnowflake().ok(expectedSnowflake) } ``` -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6413) SqlValidator does not invoke type coercion for both NATURAL and USING join conditions
Maksim Zhuravkov created CALCITE-6413: - Summary: SqlValidator does not invoke type coercion for both NATURAL and USING join conditions Key: CALCITE-6413 URL: https://issues.apache.org/jira/browse/CALCITE-6413 Project: Calcite Issue Type: Improvement Reporter: Maksim Zhuravkov This can be observed by adding these test cases to `SqlToRelConverterTest`: 1. Join condition ON expression {code:java} @Test void test1() { final String sql = "select * from emp JOIN (VALUES ('XXX')) t(deptno) ON emp.deptno = t.deptno"; sql(sql).ok(); } {code} 2. Common columns (USING/NATURAL) (since they both share the same code path for building join condition) {code:java} @Test void test2() { final String sql = "select * from emp JOIN (VALUES ('XXX')) t(deptno) USING (deptno)"; sql(sql).ok(); } {code} When test 1 runs, the SqlValidator calls TypeCoercionImpl::binaryComparisonCoercion When test 2 runs, the SqlValidator does not call TypeCoercionImpl::binaryComparisonCoercion. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6412) SqlToRelConverter fails to convert SqlMerge AST to relational expression
Pavel Pereslegin created CALCITE-6412: - Summary: SqlToRelConverter fails to convert SqlMerge AST to relational expression Key: CALCITE-6412 URL: https://issues.apache.org/jira/browse/CALCITE-6412 Project: Calcite Issue Type: Bug Reporter: Pavel Pereslegin {{SqlToRelConverter#convertMerge}} fails if the insert node contains all columns, but in an order different from the order specified by the schema. {noformat} class org.apache.calcite.rel.logical.LogicalTableScan cannot be cast to class org.apache.calcite.rel.logical.LogicalProject (org.apache.calcite.rel.logical.LogicalTableScan and org.apache.calcite.rel.logical.LogicalProject are in unnamed module of loader 'app') java.lang.ClassCastException: class org.apache.calcite.rel.logical.LogicalTableScan cannot be cast to class org.apache.calcite.rel.logical.LogicalProject (org.apache.calcite.rel.logical.LogicalTableScan and org.apache.calcite.rel.logical.LogicalProject are in unnamed module of loader 'app') at org.apache.calcite.sql2rel.SqlToRelConverter.convertMerge(SqlToRelConverter.java:4343) at org.apache.calcite.sql2rel.SqlToRelConverter.convertQueryRecursive(SqlToRelConverter.java:3818) at org.apache.calcite.sql2rel.SqlToRelConverter.convertQuery(SqlToRelConverter.java:614) at org.apache.calcite.sql.test.AbstractSqlTester.convertSqlToRel2(AbstractSqlTester.java:545) at org.apache.calcite.sql.test.AbstractSqlTester.assertSqlConvertsTo(AbstractSqlTester.java:486) at org.apache.calcite.sql.test.AbstractSqlTester.assertConvertsTo(AbstractSqlTester.java:464) at org.apache.calcite.test.SqlToRelFixture.convertsTo(SqlToRelFixture.java:106) at org.apache.calcite.test.SqlToRelFixture.ok(SqlToRelFixture.java:94) {noformat} Some more details. The following query is working fine. {code:sql} merge into empnullables e using emp t on e.empno = t.empno when matched then update set ename = t.ename, deptno = t.deptno, sal = t.sal * .1 when not matched then insert (empno, ename, job, mgr, hiredate, sal, comm, deptno, slacker) values(t.empno, t.ename, t.job, t.mgr, t.hiredate, t.sal, t.COMM, t.deptno, t.slacker) {code} But if we swap the "empno" and "ename" columns in the "not matched" section {code:sql} ... when not matched then insert (ename, empno, job, mgr, hiredate, sal, comm, deptno, slacker) values(t.ename, t.empno, t.job, t.mgr, t.hiredate, t.sal, t.COMM, t.deptno, t.slacker) {code} {{convertMerge}} fails due to missing projection in {{LogicalTableModify}} input. I'm not sure if projection is mandatory in this case (then it's a bug of convertInsert()), if not, then convertMerge() should handle the lack of projection properly. Full reproducer (for SqlToRelConverterTest). {code:java} @Test void testMergeAllColumns() { final String sql = "merge into empnullables e\n" + "using emp t on e.empno = t.empno\n" + "when matched then update\n" + "set ename = t.ename, deptno = t.deptno, sal = t.sal * .1\n" + "when not matched then insert (ename, empno, job, mgr, hiredate, sal, comm, deptno, slacker)\n" + "values(t.ename, t.empno, t.job, t.mgr, t.hiredate, t.sal, t.COMM, t.deptno, t.slacker)"; sql(sql).ok(); } {code} {code:xml} {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6411) Support Collect in ToLogicalConverter
Caican Cai created CALCITE-6411: --- Summary: Support Collect in ToLogicalConverter Key: CALCITE-6411 URL: https://issues.apache.org/jira/browse/CALCITE-6411 Project: Calcite Issue Type: Improvement Components: core Affects Versions: 1.37.0 Reporter: Caican Cai Fix For: 1.38.0 Support Collect operator expression convert to Logical in ToLogicalConverter. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6410) dateadd(MONTH, 3, date '2016-02-24') parsing failed
Caican Cai created CALCITE-6410: --- Summary: dateadd(MONTH, 3, date '2016-02-24') parsing failed Key: CALCITE-6410 URL: https://issues.apache.org/jira/browse/CALCITE-6410 Project: Calcite Issue Type: Bug Components: core Affects Versions: 1.37.0 Reporter: Caican Cai Fix For: 1.38.0 The dateadd and datediff tests are missing in SqlOperatorTest. I tried to add them to improve it. The test code is as follows {code:java} @Test void testDateAdd2() { final SqlOperatorFixture f = Fixtures.forOperators(true) .setFor(SqlLibraryOperators.DATEADD); MONTH_VARIANTS.forEach(s -> f.checkScalar("dateadd(" + s + ", 3, date '2016-02-24')", "2016-05-24 12:42:25", "TIMESTAMP(0) NOT NULL")); } {code} fail message: java.lang.RuntimeException: Error while parsing query: values (dateadd(MONTH, 3, date '2016-02-24')) at org.apache.calcite.sql.test.AbstractSqlTester.parseAndValidate(AbstractSqlTester.java:159) at org.apache.calcite.sql.test.AbstractSqlTester.validateAndThen(AbstractSqlTester.java:250) at org.apache.calcite.test.SqlOperatorFixtureImpl.lambda$forEachQueryValidateAndThen$1(SqlOperatorFixtureImpl.java:154) at org.apache.calcite.sql.test.AbstractSqlTester.forEachQuery(AbstractSqlTester.java:450) at org.apache.calcite.test.SqlOperatorFixtureImpl.forEachQueryValidateAndThen(SqlOperatorFixtureImpl.java:153) at org.apache.calcite.test.SqlOperatorFixtureImpl.checkType(SqlOperatorFixtureImpl.java:130) at org.apache.calcite.sql.test.SqlOperatorFixture.checkScalar(SqlOperatorFixture.java:220) at org.apache.calcite.test.SqlOperatorTest.lambda$testDateAdd2$150(SqlOperatorTest.java:12789) at java.util.Arrays$ArrayList.forEach(Arrays.java:3880) at org.apache.calcite.test.SqlOperatorTest.testDateAdd2(SqlOperatorTest.java:12788) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:727) at org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60) at org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131) at org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:156) at org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:147) at org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestMethod(TimeoutExtension.java:86) at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(InterceptingExecutableInvoker.java:103) at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.lambda$invoke$0(InterceptingExecutableInvoker.java:93) at org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106) at org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64) at org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45) at org.junit.jupiter.engine.execution.InvocationInterceptorChain.invoke(InvocationInterceptorChain.java:37) at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:92) at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:86) at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeTestMethod$7(TestMethodTestDescriptor.java:217) at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73) at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.invokeTestMethod(TestMethodTestDescriptor.java:213) at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:138) at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:68) at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:151) at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73) at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141) at org.junit.platform.engine.sup
[jira] [Created] (CALCITE-6409) Character types and Boolean types are comparable
Caican Cai created CALCITE-6409: --- Summary: Character types and Boolean types are comparable Key: CALCITE-6409 URL: https://issues.apache.org/jira/browse/CALCITE-6409 Project: Calcite Issue Type: Improvement Components: core Affects Versions: 1.37.0 Reporter: Caican Cai Fix For: 1.38.0 In the SameOperandTypeChecker method, the char type and the boolean type are comparable because of the call to the isComparable method. Comparability of char types and boolean types returns true in the isComparable method. However, char types and Boolean types in Spark are incomparable. Does Calcite have fixed standards here? -- This message was sent by Atlassian Jira (v8.20.10#820010)