Re: Extending SqlValidatorImpl with custom nodes
Hi Jonathan, In my opinion, I think SqlCreate or SqlDrop doesn't need pass it to the validator. When create table or drop table sql was parsed to SqlNode(SqlCreate Or SqlDrop),firstly you should determine what type of SqlNode it is and then use the appropriate Handler to handle your DDL statements. Maybe you could refer to dremio-oss, the example code like this : [1] DropTableHandler: https://github.com/dremio/dremio-oss/blob/master/sabot/kernel/src/main/java/com/dremio/exec/planner/sql/handlers/direct/DropTableHandler.java [2] CreateTableHandler: https://github.com/dremio/dremio-oss/blob/master/sabot/kernel/src/main/java/com/dremio/exec/planner/sql/handlers/query/CreateTableHandler.java Hope can help you. Best, Lake Shen Jonathan Sternberg 于2023年4月18日周二 04:42写道: > Hi, > > When attempting to implement new nodes, such as create table or drop table, > and utilize the sql validator, what's the suggested manner for > implementation? > > I see SqlValidatorImpl is the default implementation of the validator, but > I don't see a way to add new node types to the validator. When I create an > SqlCreate or SqlDrop and pass it to the validator, it throws an exception > in "registerQuery" since that node type isn't one of the expected node > types. At the same time, the methods inside of this class are all protected > so I can't extend and overwrite them to handle new node types. > > Is there another recommended way to handle node validation for custom > nodes? Especially top level nodes. > > --Jonathan Sternberg >
[jira] [Created] (CALCITE-5655) Wrong plan for multiple IN/SOME sub-queries with OR predicate
Runkang He created CALCITE-5655: --- Summary: Wrong plan for multiple IN/SOME sub-queries with OR predicate Key: CALCITE-5655 URL: https://issues.apache.org/jira/browse/CALCITE-5655 Project: Calcite Issue Type: Bug Components: core Affects Versions: 1.34.0 Reporter: Runkang He When the query contains multiple IN/SOME sub-queries connected with OR predicate in WHERE clause, the result is wrong. The minimal reproducer is below: SQL: select empno from sales.empnullables where deptno in ( select deptno from sales.deptnullables where name = 'dept1') or deptno in ( select deptno from sales.deptnullables where name = 'dept2') The Plan generated by calcite master branch: (Notice the bold part in the downstream LogicalFilter) LogicalProject(EMPNO=[$0]) LogicalProject(EMPNO=[$0], DEPTNO=[$1]) LogicalFilter(condition=[OR(AND(<>($2, 0), IS NOT NULL($5), IS NOT NULL($1)), AND(<>($2, 0), IS NOT NULL($9), IS NOT NULL($1)))]) LogicalJoin(condition=[=($1, $8)], joinType=[left]) LogicalJoin(condition=[true], joinType=[inner]) LogicalJoin(condition=[=($1, $4)], joinType=[left]) LogicalJoin(condition=[true], joinType=[inner]) LogicalProject(EMPNO=[$0], DEPTNO=[$7]) LogicalTableScan(table=[[CATALOG, SALES, EMPNULLABLES]]) LogicalAggregate(group=[{}], c=[COUNT()], ck=[COUNT($0)]) LogicalProject(DEPTNO=[$0]) LogicalFilter(condition=[=($1, 'dept1')]) LogicalTableScan(table=[[CATALOG, SALES, DEPTNULLABLES]]) LogicalProject(DEPTNO=[$0], i=[true]) LogicalFilter(condition=[=($1, 'dept1')]) LogicalTableScan(table=[[CATALOG, SALES, DEPTNULLABLES]]) LogicalAggregate(group=[{}], c=[COUNT()], ck=[COUNT($0)]) LogicalProject(DEPTNO=[$0]) LogicalFilter(condition=[=($1, 'dept2')]) LogicalTableScan(table=[[CATALOG, SALES, DEPTNULLABLES]]) LogicalProject(DEPTNO=[$0], i=[true]) LogicalFilter(condition=[=($1, 'dept2')]) LogicalTableScan(table=[[CATALOG, SALES, DEPTNULLABLES]]) The wrong part is that when build the downstream LogicalFilter for the two sub-queries, the filter for the second sub-query is AND(<>($2, 0), IS NOT NULL($9), IS NOT NULL($1)), notice that $2 should be the second sub-query's intermediate table field ct.c(which field index is $6), but now the actual reference is the first sub-query's, this leads to wrong plan, and wrong result. The root cause is that intermediate table alias is the same as the previous sub-query's, but when lookup intermediate table field, it always returns the previous one which is not belong to the current subquery. -- This message was sent by Atlassian Jira (v8.20.10#820010)
Extending SqlValidatorImpl with custom nodes
Hi, When attempting to implement new nodes, such as create table or drop table, and utilize the sql validator, what's the suggested manner for implementation? I see SqlValidatorImpl is the default implementation of the validator, but I don't see a way to add new node types to the validator. When I create an SqlCreate or SqlDrop and pass it to the validator, it throws an exception in "registerQuery" since that node type isn't one of the expected node types. At the same time, the methods inside of this class are all protected so I can't extend and overwrite them to handle new node types. Is there another recommended way to handle node validation for custom nodes? Especially top level nodes. --Jonathan Sternberg
Re: [DISCUSS] Running Sql Logic Tests for Calcite
I agree with Stamatis that this has a similar “shape” to Quidem. I’d be happy to host the project under github.com/hydromatic. (If the maven group is net.hydromatic I can publish artifacts to Maven Central and Calcite could depend on those artifacts.) Regarding the frequency of testing. If we add it to CI and (say) 5% of the tests fail, I would find that demoralizing, even though passing 95% of the tests is actually a great achievement. So I would only deploy it as part of CI if there is a way to exclude failing tests. If the SqlLogicTest tool were defined in another repo, then there could be a Calcite module under plus [1] similar to TpchTest. Julian [1] https://github.com/apache/calcite/tree/main/plus > On Apr 17, 2023, at 1:58 AM, Stamatis Zampetakis wrote: > > Hey Mihai, > > Thanks for starting this discussion! > > Let's focus on the first question for now: > > Q1: Should the new slt module under PR-3145 [1] become part of Calcite > repo or get its own? > > For those who have not followed the discussion under the CALCITE-5615 > [2] let me try to summarize a few things as per my understanding; > Mihai can amend/correct things if necessary. > > The new slt module resembles a port of sqllogictest utility [3] to > Java. It can parse and understand the test-script format used in > sqllogictest and can run this scripts over JDBC compliant databases. > It also accounts for extensions for Java engines without a JDBC > interface. > > From my perspective, the code in [1] could perfectly stand on its own > in a separate repo; there are already ports of sqllogictest in other > languages such as Rust [4] and the latter appears to be quite popular. > The sqllocitest parser/runner presents some similarities with the > Quidem [5] executor that we are using for certain tests in Calcite. > The Quidem project has its own repo although we are making use of it > in Calcite. > If it becomes a separate repo then the test scripts could also become > part of the project making it more self-contained. > > On the other hand, we already have a testkit module in Calcite so > bringing in new modules for testing purposes is relevant so why not > slt as well. If it becomes part of Calcite it can get more visibility > and facilitate maintenance since more people would be able to review > and merge changes (not only Mihai). > > Since we are talking about a new module I would like to see some more > people share their opinion on the topic before I continue the review. > > Best, > Stamatis > > [1] https://github.com/apache/calcite/pull/3145 > [2] https://issues.apache.org/jira/browse/CALCITE-5615 > [3] https://www.sqlite.org/sqllogictest/doc/trunk/about.wiki > [4] https://github.com/risinglightdb/sqllogictest-rs > [5] https://github.com/julianhyde/quidem > > > > On Sat, Apr 15, 2023 at 11:31 AM Michael Mior wrote: >> >> Very cool! One approach could be to add set these tests to run periodically >> (daily/weekly) as opposed to being part of the CI pipeline. That way we >> still have a mechanism to keep tabs on bugs but the whole build isn't >> slow/broken until this is fixed. >> >> On Fri, Apr 14, 2023, 15:20 Mihai Budiu wrote: >> >>> Hello all, >>> >>> I have submitted a PR for Calcite with a standalone executable that runs >>> the Sql Logic Test suite of 7+ million tests from sqlite. >>> >>> This is the JIRA case: https://issues.apache.org/jira/browse/CALCITE-5615 >>> And this is the PR: https://github.com/apache/calcite/pull/3145 >>> >>> As Stamatis pointed out, the PR isn't really specific to Calcite, it is a >>> general framework in Java to run these tests on any JDBC compliant >>> executor. So a question is whether this belongs to the Calcite project, or >>> some place else. sqlite is a C project, I didn't see any Java in their >>> source tree. >>> >>> Please note that SQLite is in the public domain, so their licensing terms >>> are not an obstacle to using the test scripts. >>> >>> The submitted code runs Calcite in its default configuration, but the >>> intent is for other projects that build Calcite-based compilers to be able >>> to test them by subclassing the "TestExecutors". In our own project ( >>> https://github.com/vmware/sql-to-dbsp-compiler) we have done exactly that, >>> and we are not using the JDBC API. >>> >>> The testsuite does find bugs in Calcite, both crashes and incorrect >>> results. So I think it's usefulness is not debated. >>> >>> The second question is about the packaging of this program; right now it >>> has a main() entry point and it prints the results to stderr for human >>> consumption and triage. It is not clear to me how it should be inserted in >>> a CI infrastructure, since running all 7 million tests could take a long >>> time. One possible extension would be to have the program generate a >>> regression test for Calcite for each bug it finds, but I haven't >>> implemented this feature yet (and many failures could be due to the same >>> bug). But even that
Re: [DISCUSS] Running Sql Logic Tests for Calcite
Hey Mihai, Thanks for starting this discussion! Let's focus on the first question for now: Q1: Should the new slt module under PR-3145 [1] become part of Calcite repo or get its own? For those who have not followed the discussion under the CALCITE-5615 [2] let me try to summarize a few things as per my understanding; Mihai can amend/correct things if necessary. The new slt module resembles a port of sqllogictest utility [3] to Java. It can parse and understand the test-script format used in sqllogictest and can run this scripts over JDBC compliant databases. It also accounts for extensions for Java engines without a JDBC interface. >From my perspective, the code in [1] could perfectly stand on its own in a separate repo; there are already ports of sqllogictest in other languages such as Rust [4] and the latter appears to be quite popular. The sqllocitest parser/runner presents some similarities with the Quidem [5] executor that we are using for certain tests in Calcite. The Quidem project has its own repo although we are making use of it in Calcite. If it becomes a separate repo then the test scripts could also become part of the project making it more self-contained. On the other hand, we already have a testkit module in Calcite so bringing in new modules for testing purposes is relevant so why not slt as well. If it becomes part of Calcite it can get more visibility and facilitate maintenance since more people would be able to review and merge changes (not only Mihai). Since we are talking about a new module I would like to see some more people share their opinion on the topic before I continue the review. Best, Stamatis [1] https://github.com/apache/calcite/pull/3145 [2] https://issues.apache.org/jira/browse/CALCITE-5615 [3] https://www.sqlite.org/sqllogictest/doc/trunk/about.wiki [4] https://github.com/risinglightdb/sqllogictest-rs [5] https://github.com/julianhyde/quidem On Sat, Apr 15, 2023 at 11:31 AM Michael Mior wrote: > > Very cool! One approach could be to add set these tests to run periodically > (daily/weekly) as opposed to being part of the CI pipeline. That way we > still have a mechanism to keep tabs on bugs but the whole build isn't > slow/broken until this is fixed. > > On Fri, Apr 14, 2023, 15:20 Mihai Budiu wrote: > > > Hello all, > > > > I have submitted a PR for Calcite with a standalone executable that runs > > the Sql Logic Test suite of 7+ million tests from sqlite. > > > > This is the JIRA case: https://issues.apache.org/jira/browse/CALCITE-5615 > > And this is the PR: https://github.com/apache/calcite/pull/3145 > > > > As Stamatis pointed out, the PR isn't really specific to Calcite, it is a > > general framework in Java to run these tests on any JDBC compliant > > executor. So a question is whether this belongs to the Calcite project, or > > some place else. sqlite is a C project, I didn't see any Java in their > > source tree. > > > > Please note that SQLite is in the public domain, so their licensing terms > > are not an obstacle to using the test scripts. > > > > The submitted code runs Calcite in its default configuration, but the > > intent is for other projects that build Calcite-based compilers to be able > > to test them by subclassing the "TestExecutors". In our own project ( > > https://github.com/vmware/sql-to-dbsp-compiler) we have done exactly that, > > and we are not using the JDBC API. > > > > The testsuite does find bugs in Calcite, both crashes and incorrect > > results. So I think it's usefulness is not debated. > > > > The second question is about the packaging of this program; right now it > > has a main() entry point and it prints the results to stderr for human > > consumption and triage. It is not clear to me how it should be inserted in > > a CI infrastructure, since running all 7 million tests could take a long > > time. One possible extension would be to have the program generate a > > regression test for Calcite for each bug it finds, but I haven't > > implemented this feature yet (and many failures could be due to the same > > bug). But even that mode would not naturally integrate in a CI > > infrastructure. > > > > A simple possibility is for me to just publish the code as an independent > > project on github with an MIT license (the code is derived from our > > MIT-licensed project) and just advertise it to the Calcite community. > > > > I would very much appreciate guidance. > > > > Mihai Budiu > >
[jira] [Created] (CALCITE-5654) Support PostgreSQL's interval syntax
Tim Nieradzik created CALCITE-5654: -- Summary: Support PostgreSQL's interval syntax Key: CALCITE-5654 URL: https://issues.apache.org/jira/browse/CALCITE-5654 Project: Calcite Issue Type: Improvement Components: core Reporter: Tim Nieradzik Assignee: Tim Nieradzik Support day-to-second intervals using PostgreSQL's syntax: select INTERVAL '2 weeks 3 days 4 hours 5 minutes 6 seconds 7 milliseconds' select INTERVAL '-30 day' These should be equivalent to: select INTERVAL '17 04:05:06.007' DAY TO SECOND select -INTERVAL '30 00:00:00.000' DAY TO SECOND -- This message was sent by Atlassian Jira (v8.20.10#820010)