Re: [DISCUSS] Running Sql Logic Tests for Calcite
Thank you very much, Mihai Budiu and Julian. We may use this test tool in the future.

On Sun, Apr 30, 2023 at 09:12, Julian Hyde wrote:
> An update on this, for those of you not following https://issues.apache.org/jira/browse/CALCITE-5615. We agreed to move most of Mihai's code (except for the Calcite-specific code) into a new project, sql-logic-test. Today I made release 0.1 of that project, and published the artifacts to Maven Central.
Re: [DISCUSS] Running Sql Logic Tests for Calcite
An update on this, for those of you not following https://issues.apache.org/jira/browse/CALCITE-5615. We agreed to move most of Mihai's code (except for the Calcite-specific code) into a new project, sql-logic-test. Today I made release 0.1 of that project, and published the artifacts to Maven Central.

Thanks to Mihai and Stamatis for their contributions. The announcement is on Twitter: https://twitter.com/julianhyde/status/1652409133180817408

The next step will be to rework CALCITE-5615 to use the net.hydromatic:sql-logic-test library in Calcite's test suite.

Julian

On 2023/04/17 17:34:41 Julian Hyde wrote:
> I agree with Stamatis that this has a similar “shape” to Quidem. I’d be happy to host the project under github.com/hydromatic. (If the Maven group is net.hydromatic, I can publish artifacts to Maven Central and Calcite could depend on those artifacts.)
Re: [DISCUSS] Running Sql Logic Tests for Calcite
I agree with Stamatis that this has a similar “shape” to Quidem. I’d be happy to host the project under github.com/hydromatic. (If the Maven group is net.hydromatic, I can publish artifacts to Maven Central and Calcite could depend on those artifacts.)

Regarding the frequency of testing: if we add it to CI and (say) 5% of the tests fail, I would find that demoralizing, even though passing 95% of the tests is actually a great achievement. So I would only deploy it as part of CI if there is a way to exclude failing tests.

If the SqlLogicTest tool were defined in another repo, then there could be a Calcite module under plus [1] similar to TpchTest.

Julian

[1] https://github.com/apache/calcite/tree/main/plus

> On Apr 17, 2023, at 1:58 AM, Stamatis Zampetakis wrote:
>
> Q1: Should the new slt module under PR-3145 [1] become part of Calcite repo or get its own?
>
> [1] https://github.com/apache/calcite/pull/3145
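One way to read the "exclude failing tests" condition is as an exclusion list that CI consults, so that only failures outside the list break the build. The following is a minimal sketch under that assumption. The class `KnownFailures`, its methods, and the test names in the usage note are hypothetical and are not part of Calcite or of the sql-logic-test project.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper; not part of Calcite or of the sql-logic-test project. */
final class KnownFailures {
  private final Set<String> excluded;

  KnownFailures(Collection<String> excluded) {
    this.excluded = new HashSet<>(excluded);
  }

  /** Failures that are not on the exclusion list; only these should break the build. */
  List<String> unexpected(Collection<String> failed) {
    List<String> result = new ArrayList<>();
    for (String name : failed) {
      if (!excluded.contains(name)) {
        result.add(name);
      }
    }
    return result;
  }

  /** Excluded tests that did not fail this time; candidates for removal from the list. */
  List<String> nowPassing(Collection<String> failed) {
    List<String> result = new ArrayList<>(excluded);
    result.removeAll(failed);
    Collections.sort(result);
    return result;
  }
}
```

With an exclusion list of `select1.test` and `select4.test` and a run in which `select1.test` and `select3.test` fail, `unexpected` would report only `select3.test`, and `nowPassing` would report `select4.test` as a test that could come off the list.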
Re: [DISCUSS] Running Sql Logic Tests for Calcite
Hey Mihai,

Thanks for starting this discussion!

Let's focus on the first question for now:

Q1: Should the new slt module under PR-3145 [1] become part of Calcite repo or get its own?

For those who have not followed the discussion under CALCITE-5615 [2], let me try to summarize a few things as per my understanding; Mihai can amend/correct things if necessary.

The new slt module resembles a port of the sqllogictest utility [3] to Java. It can parse and understand the test-script format used in sqllogictest and can run these scripts over JDBC-compliant databases. It also accounts for extensions for Java engines without a JDBC interface.

From my perspective, the code in [1] could perfectly stand on its own in a separate repo; there are already ports of sqllogictest in other languages such as Rust [4], and the latter appears to be quite popular. The sqllogictest parser/runner presents some similarities with the Quidem [5] executor that we are using for certain tests in Calcite. The Quidem project has its own repo although we are making use of it in Calcite. If it becomes a separate repo then the test scripts could also become part of the project, making it more self-contained.

On the other hand, we already have a testkit module in Calcite, so there is precedent for bringing in new modules for testing purposes; why not slt as well? If it becomes part of Calcite it can get more visibility, and maintenance becomes easier since more people would be able to review and merge changes (not only Mihai).

Since we are talking about a new module, I would like to see some more people share their opinion on the topic before I continue the review.

Best,
Stamatis

[1] https://github.com/apache/calcite/pull/3145
[2] https://issues.apache.org/jira/browse/CALCITE-5615
[3] https://www.sqlite.org/sqllogictest/doc/trunk/about.wiki
[4] https://github.com/risinglightdb/sqllogictest-rs
[5] https://github.com/julianhyde/quidem

On Sat, Apr 15, 2023 at 11:31 AM Michael Mior wrote:
> Very cool! One approach could be to set these tests to run periodically (daily/weekly) as opposed to being part of the CI pipeline.
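To make "parse and understand the test-script format" concrete: a sqllogictest script is a sequence of records separated by blank lines, where a statement record is a `statement ok` or `statement error` line followed by SQL, and a query record is a `query` line followed by SQL, a `----` separator, and the expected result values. The following is a rough sketch of a parser for that shape. The class and method names are hypothetical and are not the API of the slt module or of sql-logic-test; the sketch also drops comments and ignores `skipif`/`onlyif` conditions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical, simplified parser; not the API of the slt module or of sql-logic-test. */
final class SltScriptSketch {
  /** One record of a script: a statement, or a query with expected result lines. */
  static final class SltRecord {
    final String header;         // e.g. "statement ok" or "query I nosort"
    final String sql;            // SQL text, possibly spanning several lines
    final List<String> expected; // lines after "----"; empty for statements

    SltRecord(String header, String sql, List<String> expected) {
      this.header = header;
      this.sql = sql;
      this.expected = expected;
    }

    boolean isQuery() {
      return header.startsWith("query");
    }
  }

  /** Splits a script into records; records are separated by blank lines. */
  static List<SltRecord> parse(String script) {
    List<SltRecord> records = new ArrayList<>();
    List<String> block = new ArrayList<>();
    // The appended newline guarantees that the last record is flushed.
    for (String raw : (script + "\n").split("\n", -1)) {
      String line = raw.trim();
      // Comments are dropped; skipif/onlyif conditions are ignored in this sketch.
      if (line.startsWith("#") || line.startsWith("skipif ") || line.startsWith("onlyif ")) {
        continue;
      }
      if (!line.isEmpty()) {
        block.add(line);
      } else if (!block.isEmpty()) {
        records.add(toRecord(block));
        block = new ArrayList<>();
      }
    }
    return records;
  }

  private static SltRecord toRecord(List<String> block) {
    int sep = block.indexOf("----"); // separates SQL from expected results
    int sqlEnd = sep < 1 ? block.size() : sep;
    String sql = String.join("\n", block.subList(1, sqlEnd));
    List<String> expected = sep < 1
        ? Collections.<String>emptyList()
        : new ArrayList<String>(block.subList(sep + 1, block.size()));
    return new SltRecord(block.get(0), sql, expected);
  }
}
```

A runner would then walk the parsed records, send each `sql` string to the engine under test, and compare query output against `expected`, taking the type string and sort mode from the `query` header into account.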
Re: [DISCUSS] Running Sql Logic Tests for Calcite
Very cool! One approach could be to set these tests to run periodically (daily/weekly) as opposed to being part of the CI pipeline. That way we still have a mechanism to keep tabs on bugs, but the whole build isn't slow/broken until this is fixed.

On Fri, Apr 14, 2023, 15:20 Mihai Budiu wrote:
> I have submitted a PR for Calcite with a standalone executable that runs the Sql Logic Test suite of 7+ million tests from sqlite.
[DISCUSS] Running Sql Logic Tests for Calcite
Hello all,

I have submitted a PR for Calcite with a standalone executable that runs the Sql Logic Test suite of 7+ million tests from SQLite.

This is the JIRA case: https://issues.apache.org/jira/browse/CALCITE-5615
And this is the PR: https://github.com/apache/calcite/pull/3145

As Stamatis pointed out, the PR isn't really specific to Calcite; it is a general framework in Java to run these tests on any JDBC-compliant executor. So a question is whether this belongs to the Calcite project, or some place else. SQLite is a C project; I didn't see any Java in their source tree.

Please note that SQLite is in the public domain, so their licensing terms are not an obstacle to using the test scripts.

The submitted code runs Calcite in its default configuration, but the intent is for other projects that build Calcite-based compilers to be able to test them by subclassing the "TestExecutors". In our own project (https://github.com/vmware/sql-to-dbsp-compiler) we have done exactly that, and we are not using the JDBC API.

The test suite does find bugs in Calcite, both crashes and incorrect results. So I think its usefulness is not in question.

The second question is about the packaging of this program; right now it has a main() entry point and it prints the results to stderr for human consumption and triage. It is not clear to me how it should be inserted into a CI infrastructure, since running all 7 million tests could take a long time. One possible extension would be to have the program generate a regression test for Calcite for each bug it finds, but I haven't implemented this feature yet (and many failures could be due to the same bug). But even that mode would not naturally integrate into a CI infrastructure.

A simple possibility is for me to just publish the code as an independent project on GitHub with an MIT license (the code is derived from our MIT-licensed project) and just advertise it to the Calcite community.

I would very much appreciate guidance.

Mihai Budiu
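The idea of testing an engine by subclassing an executor, without going through JDBC, can be sketched roughly as follows. Everything here is an assumption made for illustration: `SketchExecutor`, `ToyExecutor`, and their methods are hypothetical names, and the actual executor classes in the PR may be shaped quite differently. The base class counts passes and failures and reports both crashes and wrong results on stderr, in the spirit of the behavior described above.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical base class; the real executor classes in the PR may look different. */
abstract class SketchExecutor {
  int passed;
  int failed;

  /** Runs one query and returns its result values, one string per value. */
  abstract List<String> execute(String sql) throws Exception;

  /** Compares actual with expected values; a crash also counts as a failure. */
  final void check(String sql, List<String> expected) {
    try {
      if (execute(sql).equals(expected)) {
        passed++;
      } else {
        failed++;
        System.err.println("Incorrect result: " + sql);
      }
    } catch (Exception e) {
      failed++;
      System.err.println("Crash: " + sql + " (" + e + ")");
    }
  }
}

/** Toy engine without JDBC; it only understands "SELECT <integer>". */
final class ToyExecutor extends SketchExecutor {
  @Override List<String> execute(String sql) {
    String prefix = "SELECT ";
    if (!sql.startsWith(prefix)) {
      throw new UnsupportedOperationException(sql);
    }
    int value = Integer.parseInt(sql.substring(prefix.length()).trim());
    return Arrays.asList(String.valueOf(value));
  }
}
```

A project with its own Calcite-based compiler would replace `ToyExecutor` with a subclass that compiles and runs the SQL in its own engine, while the comparison and reporting logic stays in the shared base class.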