Very cool! One approach could be to set these tests to run periodically (daily or weekly) instead of as part of the CI pipeline. That way we still have a mechanism to keep tabs on bugs, but the whole build isn't slowed down or broken until the bugs are fixed.

On Fri, Apr 14, 2023, 15:20 Mihai Budiu <mbu...@gmail.com> wrote:

> Hello all,
>
> I have submitted a PR for Calcite with a standalone executable that runs
> the SQL Logic Test suite of 7+ million tests from SQLite.
>
> This is the JIRA case: https://issues.apache.org/jira/browse/CALCITE-5615
> And this is the PR: https://github.com/apache/calcite/pull/3145
>
> As Stamatis pointed out, the PR isn't really specific to Calcite; it is a
> general framework in Java to run these tests on any JDBC-compliant
> executor. So one question is whether this belongs in the Calcite project or
> somewhere else. SQLite is a C project, and I didn't see any Java in their
> source tree.
>
> Please note that SQLite is in the public domain, so their licensing terms
> are not an obstacle to using the test scripts.
>
> The submitted code runs Calcite in its default configuration, but the
> intent is for other projects that build Calcite-based compilers to be able
> to test them by subclassing the "TestExecutors". In our own project
> (https://github.com/vmware/sql-to-dbsp-compiler) we have done exactly that,
> and we are not using the JDBC API.
>
> The test suite does find bugs in Calcite, both crashes and incorrect
> results, so I think its usefulness is not in doubt.
>
> The second question is about the packaging of this program. Right now it
> has a main() entry point and prints the results to stderr for human
> consumption and triage. It is not clear to me how it should be inserted
> into a CI infrastructure, since running all 7 million tests could take a
> long time. One possible extension would be to have the program generate a
> regression test for Calcite for each bug it finds, but I haven't
> implemented this feature yet (and many failures could be due to the same
> bug). Even that mode would not integrate naturally into a CI
> infrastructure.
>
> A simple possibility is for me to just publish the code as an independent
> project on GitHub with an MIT license (the code is derived from our
> MIT-licensed project) and advertise it to the Calcite community.
>
> I would very much appreciate guidance.
>
> Mihai Budiu