Re: Extending SqlValidatorImpl with custom nodes

2023-04-17 Thread LakeShen
Hi Jonathan,

In my opinion, I think SqlCreate or SqlDrop doesn't need pass it to the
validator.
When create table or drop table sql was parsed to SqlNode(SqlCreate Or
SqlDrop),firstly you should determine what type of SqlNode it is and then
use the appropriate Handler to handle your DDL statements.

Maybe you could refer to dremio-oss, the example code like this :
[1] DropTableHandler:
https://github.com/dremio/dremio-oss/blob/master/sabot/kernel/src/main/java/com/dremio/exec/planner/sql/handlers/direct/DropTableHandler.java
[2] CreateTableHandler:
https://github.com/dremio/dremio-oss/blob/master/sabot/kernel/src/main/java/com/dremio/exec/planner/sql/handlers/query/CreateTableHandler.java

Hope can help you.

Best,
Lake Shen

Jonathan Sternberg  于2023年4月18日周二 04:42写道:

> Hi,
>
> When attempting to implement new nodes, such as create table or drop table,
> and utilize the sql validator, what's the suggested manner for
> implementation?
>
> I see SqlValidatorImpl is the default implementation of the validator, but
> I don't see a way to add new node types to the validator. When I create an
> SqlCreate or SqlDrop and pass it to the validator, it throws an exception
> in "registerQuery" since that node type isn't one of the expected node
> types. At the same time, the methods inside of this class are all protected
> so I can't extend and overwrite them to handle new node types.
>
> Is there another recommended way to handle node validation for custom
> nodes? Especially top level nodes.
>
> --Jonathan Sternberg
>


[jira] [Created] (CALCITE-5655) Wrong plan for multiple IN/SOME sub-queries with OR predicate

2023-04-17 Thread Runkang He (Jira)
Runkang He created CALCITE-5655:
---

 Summary: Wrong plan for multiple IN/SOME sub-queries with OR 
predicate
 Key: CALCITE-5655
 URL: https://issues.apache.org/jira/browse/CALCITE-5655
 Project: Calcite
  Issue Type: Bug
  Components: core
Affects Versions: 1.34.0
Reporter: Runkang He


When the query contains multiple IN/SOME sub-queries connected with OR 
predicate in WHERE clause, the result is wrong. The minimal reproducer is below:
SQL:
select empno from sales.empnullables
where deptno in (
  select deptno from sales.deptnullables where name = 'dept1')
or deptno in (
  select deptno from sales.deptnullables where name = 'dept2')
The Plan generated by calcite master branch: (Notice the bold part in the 
downstream LogicalFilter)
LogicalProject(EMPNO=[$0])
  LogicalProject(EMPNO=[$0], DEPTNO=[$1])
LogicalFilter(condition=[OR(AND(<>($2, 0), IS NOT NULL($5), IS NOT 
NULL($1)), AND(<>($2, 0), IS NOT NULL($9), IS NOT NULL($1)))])
  LogicalJoin(condition=[=($1, $8)], joinType=[left])
LogicalJoin(condition=[true], joinType=[inner])
  LogicalJoin(condition=[=($1, $4)], joinType=[left])
LogicalJoin(condition=[true], joinType=[inner])
  LogicalProject(EMPNO=[$0], DEPTNO=[$7])
LogicalTableScan(table=[[CATALOG, SALES, EMPNULLABLES]])
  LogicalAggregate(group=[{}], c=[COUNT()], ck=[COUNT($0)])
LogicalProject(DEPTNO=[$0])
  LogicalFilter(condition=[=($1, 'dept1')])
LogicalTableScan(table=[[CATALOG, SALES, DEPTNULLABLES]])
LogicalProject(DEPTNO=[$0], i=[true])
  LogicalFilter(condition=[=($1, 'dept1')])
LogicalTableScan(table=[[CATALOG, SALES, DEPTNULLABLES]])
  LogicalAggregate(group=[{}], c=[COUNT()], ck=[COUNT($0)])
LogicalProject(DEPTNO=[$0])
  LogicalFilter(condition=[=($1, 'dept2')])
LogicalTableScan(table=[[CATALOG, SALES, DEPTNULLABLES]])
LogicalProject(DEPTNO=[$0], i=[true])
  LogicalFilter(condition=[=($1, 'dept2')])
LogicalTableScan(table=[[CATALOG, SALES, DEPTNULLABLES]])
The wrong part is that when build the downstream LogicalFilter for the two 
sub-queries, the filter for the second sub-query is AND(<>($2, 0), IS NOT 
NULL($9), IS NOT NULL($1)), notice that $2 should be the second sub-query's 
intermediate table field ct.c(which field index is $6), but now the actual 
reference is the first sub-query's, this leads to wrong plan, and wrong result.
The root cause is that intermediate table alias is the same as the previous 
sub-query's, but when lookup intermediate table field, it always returns the 
previous one which is not belong to the current subquery. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Extending SqlValidatorImpl with custom nodes

2023-04-17 Thread Jonathan Sternberg
Hi,

When attempting to implement new nodes, such as create table or drop table,
and utilize the sql validator, what's the suggested manner for
implementation?

I see SqlValidatorImpl is the default implementation of the validator, but
I don't see a way to add new node types to the validator. When I create an
SqlCreate or SqlDrop and pass it to the validator, it throws an exception
in "registerQuery" since that node type isn't one of the expected node
types. At the same time, the methods inside of this class are all protected
so I can't extend and overwrite them to handle new node types.

Is there another recommended way to handle node validation for custom
nodes? Especially top level nodes.

--Jonathan Sternberg


Re: [DISCUSS] Running Sql Logic Tests for Calcite

2023-04-17 Thread Julian Hyde
I agree with Stamatis that this has a similar “shape” to Quidem. I’d be happy 
to host the project under github.com/hydromatic. (If the maven group is 
net.hydromatic I can publish artifacts to Maven Central and Calcite could 
depend on those artifacts.)

Regarding the frequency of testing. If we add it to CI and (say) 5% of the 
tests fail, I would find that demoralizing, even though passing 95% of the 
tests is actually a great achievement. So I would only deploy it as part of CI 
if there is a way to exclude failing tests.

If the SqlLogicTest tool were defined in another repo, then there could be a 
Calcite module under plus [1] similar to TpchTest.

Julian

[1] https://github.com/apache/calcite/tree/main/plus 



> On Apr 17, 2023, at 1:58 AM, Stamatis Zampetakis  wrote:
> 
> Hey Mihai,
> 
> Thanks for starting this discussion!
> 
> Let's focus on the first question for now:
> 
> Q1: Should the new slt module under PR-3145 [1] become part of Calcite
> repo or get its own?
> 
> For those who have not followed the discussion under the CALCITE-5615
> [2] let me try to summarize a few things as per my understanding;
> Mihai can amend/correct things if necessary.
> 
> The new slt module resembles a port of sqllogictest utility [3] to
> Java. It can parse and understand the test-script format used in
> sqllogictest and can run this scripts over JDBC compliant databases.
> It also accounts for extensions for Java engines without a JDBC
> interface.
> 
> From my perspective, the code in [1] could perfectly stand on its own
> in a separate repo; there are already ports of sqllogictest in other
> languages such as Rust [4] and the latter appears to be quite popular.
> The sqllocitest parser/runner presents some similarities with the
> Quidem [5] executor that we are using for certain tests in Calcite.
> The Quidem project has its own repo although we are making use of it
> in Calcite.
> If it becomes a separate repo then the test scripts could also become
> part of the project making it more self-contained.
> 
> On the other hand, we already have a testkit module in Calcite so
> bringing in new modules for testing purposes is relevant so why not
> slt as well. If it becomes part of Calcite it can get more visibility
> and facilitate maintenance since more people would be able to review
> and merge changes (not only Mihai).
> 
> Since we are talking about a new module I would like to see some more
> people share their opinion on the topic before I continue the review.
> 
> Best,
> Stamatis
> 
> [1] https://github.com/apache/calcite/pull/3145
> [2] https://issues.apache.org/jira/browse/CALCITE-5615
> [3] https://www.sqlite.org/sqllogictest/doc/trunk/about.wiki
> [4] https://github.com/risinglightdb/sqllogictest-rs
> [5] https://github.com/julianhyde/quidem
> 
> 
> 
> On Sat, Apr 15, 2023 at 11:31 AM Michael Mior  wrote:
>> 
>> Very cool! One approach could be to add set these tests to run periodically
>> (daily/weekly) as opposed to being part of the CI pipeline. That way we
>> still have a mechanism to keep tabs on bugs but the whole build isn't
>> slow/broken until this is fixed.
>> 
>> On Fri, Apr 14, 2023, 15:20 Mihai Budiu  wrote:
>> 
>>> Hello all,
>>> 
>>> I have submitted a PR for Calcite with a standalone executable that runs
>>> the Sql Logic Test suite of 7+ million tests from sqlite.
>>> 
>>> This is the JIRA case: https://issues.apache.org/jira/browse/CALCITE-5615
>>> And this is the PR: https://github.com/apache/calcite/pull/3145
>>> 
>>> As Stamatis pointed out, the PR isn't really specific to Calcite, it is a
>>> general framework in Java to run these tests on any JDBC compliant
>>> executor. So a question is whether this belongs to the Calcite project, or
>>> some place else. sqlite is a C project, I didn't see any Java in their
>>> source tree.
>>> 
>>> Please note that SQLite is in the public domain, so their licensing terms
>>> are not an obstacle to using the test scripts.
>>> 
>>> The submitted code runs Calcite in its default configuration, but the
>>> intent is for other projects that build Calcite-based compilers to be able
>>> to test them by subclassing the "TestExecutors". In our own project (
>>> https://github.com/vmware/sql-to-dbsp-compiler) we have done exactly that,
>>> and we are not using the JDBC API.
>>> 
>>> The testsuite does find bugs in Calcite, both crashes and incorrect
>>> results. So I think it's usefulness is not debated.
>>> 
>>> The second question is about the packaging of this program; right now it
>>> has a main() entry point and it prints the results to stderr for human
>>> consumption and triage. It is not clear to me how it should be inserted in
>>> a CI infrastructure, since running all 7 million tests could take a long
>>> time. One possible extension would be to have the program generate a
>>> regression test for Calcite for each bug it finds, but I haven't
>>> implemented this feature yet (and many failures could be due to the same
>>> bug). But even that 

Re: [DISCUSS] Running Sql Logic Tests for Calcite

2023-04-17 Thread Stamatis Zampetakis
Hey Mihai,

Thanks for starting this discussion!

Let's focus on the first question for now:

Q1: Should the new slt module under PR-3145 [1] become part of Calcite
repo or get its own?

For those who have not followed the discussion under the CALCITE-5615
[2] let me try to summarize a few things as per my understanding;
Mihai can amend/correct things if necessary.

The new slt module resembles a port of sqllogictest utility [3] to
Java. It can parse and understand the test-script format used in
sqllogictest and can run this scripts over JDBC compliant databases.
It also accounts for extensions for Java engines without a JDBC
interface.

>From my perspective, the code in [1] could perfectly stand on its own
in a separate repo; there are already ports of sqllogictest in other
languages such as Rust [4] and the latter appears to be quite popular.
The sqllocitest parser/runner presents some similarities with the
Quidem [5] executor that we are using for certain tests in Calcite.
The Quidem project has its own repo although we are making use of it
in Calcite.
If it becomes a separate repo then the test scripts could also become
part of the project making it more self-contained.

On the other hand, we already have a testkit module in Calcite so
bringing in new modules for testing purposes is relevant so why not
slt as well. If it becomes part of Calcite it can get more visibility
and facilitate maintenance since more people would be able to review
and merge changes (not only Mihai).

Since we are talking about a new module I would like to see some more
people share their opinion on the topic before I continue the review.

Best,
Stamatis

[1] https://github.com/apache/calcite/pull/3145
[2] https://issues.apache.org/jira/browse/CALCITE-5615
[3] https://www.sqlite.org/sqllogictest/doc/trunk/about.wiki
[4] https://github.com/risinglightdb/sqllogictest-rs
[5] https://github.com/julianhyde/quidem



On Sat, Apr 15, 2023 at 11:31 AM Michael Mior  wrote:
>
> Very cool! One approach could be to add set these tests to run periodically
> (daily/weekly) as opposed to being part of the CI pipeline. That way we
> still have a mechanism to keep tabs on bugs but the whole build isn't
> slow/broken until this is fixed.
>
> On Fri, Apr 14, 2023, 15:20 Mihai Budiu  wrote:
>
> > Hello all,
> >
> > I have submitted a PR for Calcite with a standalone executable that runs
> > the Sql Logic Test suite of 7+ million tests from sqlite.
> >
> > This is the JIRA case: https://issues.apache.org/jira/browse/CALCITE-5615
> > And this is the PR: https://github.com/apache/calcite/pull/3145
> >
> > As Stamatis pointed out, the PR isn't really specific to Calcite, it is a
> > general framework in Java to run these tests on any JDBC compliant
> > executor. So a question is whether this belongs to the Calcite project, or
> > some place else. sqlite is a C project, I didn't see any Java in their
> > source tree.
> >
> > Please note that SQLite is in the public domain, so their licensing terms
> > are not an obstacle to using the test scripts.
> >
> > The submitted code runs Calcite in its default configuration, but the
> > intent is for other projects that build Calcite-based compilers to be able
> > to test them by subclassing the "TestExecutors". In our own project (
> > https://github.com/vmware/sql-to-dbsp-compiler) we have done exactly that,
> > and we are not using the JDBC API.
> >
> > The testsuite does find bugs in Calcite, both crashes and incorrect
> > results. So I think it's usefulness is not debated.
> >
> > The second question is about the packaging of this program; right now it
> > has a main() entry point and it prints the results to stderr for human
> > consumption and triage. It is not clear to me how it should be inserted in
> > a CI infrastructure, since running all 7 million tests could take a long
> > time. One possible extension would be to have the program generate a
> > regression test for Calcite for each bug it finds, but I haven't
> > implemented this feature yet (and many failures could be due to the same
> > bug). But even that mode would not naturally integrate in a CI
> > infrastructure.
> >
> > A simple possibility is for me to just publish the code as an independent
> > project on github with an MIT license (the code is derived from our
> > MIT-licensed project) and just advertise it to the Calcite community.
> >
> > I would very much appreciate guidance.
> >
> > Mihai Budiu
> >


[jira] [Created] (CALCITE-5654) Support PostgreSQL's interval syntax

2023-04-17 Thread Tim Nieradzik (Jira)
Tim Nieradzik created CALCITE-5654:
--

 Summary: Support PostgreSQL's interval syntax
 Key: CALCITE-5654
 URL: https://issues.apache.org/jira/browse/CALCITE-5654
 Project: Calcite
  Issue Type: Improvement
  Components: core
Reporter: Tim Nieradzik
Assignee: Tim Nieradzik


Support day-to-second intervals using PostgreSQL's syntax:
select INTERVAL '2 weeks 3 days 4 hours 5 minutes 6 seconds 7 milliseconds'
select INTERVAL '-30 day'

These should be equivalent to:
select INTERVAL '17 04:05:06.007' DAY TO SECOND
select -INTERVAL '30 00:00:00.000' DAY TO SECOND



--
This message was sent by Atlassian Jira
(v8.20.10#820010)