Re: Exception:Catch clause is unreachable

2020-07-21 Thread Danny Chan
It looks like there is a problem with the generated code. Follow the guide at 
[1] to debug the generated code and see which line causes the compilation error.

[1] 
https://calcite.apache.org/docs/howto.html#debugging-generated-classes-in-intellij
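
For reference, the guide at [1] works by setting two Janino system properties
so that the generated sources are written to disk, where a debugger can step
through them. A minimal sketch in Java follows. The class name, method name and
the directory are made up for illustration, and it is safest to set the
properties before any Calcite or Janino class is loaded (passing the same names
as -D flags on the JVM command line is equivalent).

```java
class JaninoDebugSetup {
  /**
   * Asks Janino to keep the sources it compiles, so they can be debugged.
   * The directory argument is a hypothetical example; the second property is optional.
   */
  static void enable(String sourceDir) {
    System.setProperty("org.codehaus.janino.source_debugging.enable", "true");
    System.setProperty("org.codehaus.janino.source_debugging.dir", sourceDir);
  }

  public static void main(String[] args) {
    // Call this first, then open the Calcite connection and run the failing query.
    enable(System.getProperty("java.io.tmpdir"));
    System.out.println(System.getProperty("org.codehaus.janino.source_debugging.enable"));
  }
}
```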

Best,
Danny Chan
On July 22, 2020 at 12:17 AM +0800, taojin wrote:
> hi folks,
>
> I hit a compiler error with calcite-core:1.23.0. The JDK version is
> openjdk version "11.0.7" 2020-04-14.
>
> Here is the error log. Is this a bug?
>
> Exception in thread "main" java.sql.SQLException: Error while executing
> SQL "select * from test.t": Error while compiling generated Java code:
> public org.apache.calcite.linq4j.Enumerable bind(final
> org.apache.calcite.DataContext root) {
>   final org.apache.calcite.linq4j.function.Function1 rowBuilderFactory
> = new org.apache.calcite.linq4j.function.Function1() {
>     public org.apache.calcite.linq4j.function.Function0 apply(final
> java.sql.ResultSet resultSet) {
>   return new org.apache.calcite.linq4j.function.Function0() {
>   public Object apply() {
>     try {
>   return new Object[0];
>     } catch (java.sql.SQLException e) {
>   throw new RuntimeException(
>     e);
>     }
>   }
>     }
>   ;
>     }
>     public Object apply(final Object resultSet) {
>   return apply(
>     (java.sql.ResultSet) resultSet);
>     }
>   }
>   ;
>   final org.apache.calcite.runtime.ResultSetEnumerable enumerable =
> org.apache.calcite.runtime.ResultSetEnumerable.of((javax.sql.DataSource)
> root.getRootSchema().getSubSchema("test").unwrap(javax.sql.DataSource.class),
> "SELECT *\nFROM hdb.t", rowBuilderFactory);
>   enumerable.setTimeout(root);
>   return enumerable;
> }
>
>
> public Class getElementType() {
>   return org.apache.calcite.runtime.FlatLists.ComparableList.class;
> }
>
>
>
>     at org.apache.calcite.avatica.Helper.createException(Helper.java:56)
>     at org.apache.calcite.avatica.Helper.createException(Helper.java:41)
>     at
> org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:163)
>     at
> org.apache.calcite.avatica.AvaticaStatement.executeQuery(AvaticaStatement.java:227)
>     at mtest.App5.main(App5.java:45)
> Caused by: java.lang.RuntimeException: Error while compiling generated
> Java code:
> public org.apache.calcite.linq4j.Enumerable bind(final
> org.apache.calcite.DataContext root) {
>   final org.apache.calcite.linq4j.function.Function1 rowBuilderFactory
> = new org.apache.calcite.linq4j.function.Function1() {
>     public org.apache.calcite.linq4j.function.Function0 apply(final
> java.sql.ResultSet resultSet) {
>   return new org.apache.calcite.linq4j.function.Function0() {
>   public Object apply() {
>     try {
>   return new Object[0];
>     } catch (java.sql.SQLException e) {
>   throw new RuntimeException(
>     e);
>     }
>   }
>     }
>   ;
>     }
>     public Object apply(final Object resultSet) {
>   return apply(
>     (java.sql.ResultSet) resultSet);
>     }
>   }
>   ;
>   final org.apache.calcite.runtime.ResultSetEnumerable enumerable =
> org.apache.calcite.runtime.ResultSetEnumerable.of((javax.sql.DataSource)
> root.getRootSchema().getSubSchema("test").unwrap(javax.sql.DataSource.class),
> "SELECT *\nFROM hdb.t", rowBuilderFactory);
>   enumerable.setTimeout(root);
>   return enumerable;
> }
>
>
> public Class getElementType() {
>   return org.apache.calcite.runtime.FlatLists.ComparableList.class;
> }
>
>
>
>     at org.apache.calcite.avatica.Helper.wrap(Helper.java:37)
>     at
> org.apache.calcite.adapter.enumerable.EnumerableInterpretable.toBindable(EnumerableInterpretable.java:128)
>     at
> org.apache.calcite.prepare.CalcitePrepareImpl$CalcitePreparingStmt.implement(CalcitePrepareImpl.java:)
>     at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:309)
>     at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:208)
>     at
> org.apache.calcite.prepare.CalcitePrepareImpl.prepare2_(CalcitePrepareImpl.java:632)
>     at
> org.apache.calcite.prepare.CalcitePrepareImpl.prepare_(CalcitePrepareImpl.java:498)
>     at
> org.apache.calcite.prepare.CalcitePrepareImpl.prepareSql(CalcitePrepareImpl.java:468)
>     at
> org.apache.calcite.jdbc.CalciteConnectionImpl.parseQuery(CalciteConnectionImpl.java:231)
>     at
> org.apache.calcite.jdbc.CalciteMetaImpl.prepareAndExecute(CalciteMetaImpl.java:552)
>     at
> org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:675)
>     at
> org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:156)
>     ... 2 more
> Caused by: org.codehaus.commons.compiler.CompileException: Line 8,
> Column 20: Catch clause is unreachable
>     at
> org.codehaus.janino.UnitCompiler.compileError(UnitCompiler.java:12211)
>     at
> 

Re: [DISCUSS] New Join Type: ANTI_NOTIN

2020-07-21 Thread Julian Hyde
Haisheng,

You are confusing logical and physical algebra.

A target DB might have physical operators that can implement several logical 
operators. Just because an IN list is represented as an OR it doesn’t mean that 
it has to be executed as an OR.

We don’t HAVE TO add every physical operation into our logical algebra. We may 
choose to. There is a tradeoff - if we add a feature to our logical language, 
we have to revisit all of our transformation rules to deal with it.

Vladimir said he *expected* Oracle would implement (3-valued) NOT IN 
efficiently. (Back in the day, when I was at Oracle, they certainly did not.) 
Does anyone have any evidence that they do?

Julian


> On Jul 21, 2020, at 1:17 PM, Haisheng Yuan  wrote:
> 
> Thanks Julian for explaining NOT IN to everyone. I am sorry for not 
> making myself clear.
> 
> For that kind of simple, uncorrelated NOT IN sub-query under 2-valued logic, 
> SQL Server and Greenplum can generate an efficient plan with a null-aware 
> ANTI SEMI join.
> 
> Hash ANTI SEMI Join for NOT IN 
> Join Condition: t1.a = t2.a
> ->  Table Scan on t1  
> ->  Hash 
>   ->  Table Scan on t2 
> 
> ANTI_NOTIN can be interpreted as a null-aware ANTI join. As vlsi pointed 
> out, Oracle supports that too.
> 
> Of course, that requires the query executor to deal specifically with this 
> kind of join. But the optimizer should not be the database's bottleneck; it 
> should generate as good a plan as possible. If the query executor can support 
> the operator, good. If not, the downstream project can just disable it.
> 
> Just like "a in (1,2,3...1000)": many databases (Greenplum, Postgres, TiDB 
> ...) support an implicit lookup inside the Filter operator, so the list 
> doesn't need to be expanded to OR or to an explicit inner join. The implicit 
> lookup is more efficient. I think downstream projects would love to adapt to 
> support the good plan; if not, they can choose to disable it, or rewrite it 
> to a form they support during a post-processing stage after the best plan is 
> generated. But we shouldn't refrain from generating this kind of plan just 
> because some downstream project might not support it.
> 
> On 2020/07/21 18:41:39, Julian Hyde  wrote: 
>> I want to remind everyone how hard it is to evaluate NOT IN queries.
>> Here is an example query:
>> 
>> sqlline> !connect
>> jdbc:calcite:model=core/build/resources/test/hsqldb-model.json sa sa
>> > !set outputFormat csv
>> > EXPLAIN PLAN FOR SELECT * FROM dept WHERE deptno NOT IN (SELECT mgr FROM emp);
>> EnumerableCalc(expr#0..6=[{inputs}], expr#7=[0], expr#8=[=($t3, $t7)],
>> expr#9=[IS NULL($t6)], expr#10=[>=($t4, $t3)], expr#11=[AND($t9,
>> $t10)], expr#12=[OR($t8, $t11)], proj#0..2=[{exprs}],
>> $condition=[$t12])
>>   EnumerableHashJoin(condition=[=($0, $5)], joinType=[left])
>>     EnumerableNestedLoopJoin(condition=[true], joinType=[inner])
>>       JdbcToEnumerableConverter
>>         JdbcTableScan(table=[[SCOTT, DEPT]])
>>       JdbcToEnumerableConverter
>>         JdbcAggregate(group=[{}], c=[COUNT()], ck=[COUNT($3)])
>>           JdbcTableScan(table=[[SCOTT, EMP]])
>>     JdbcToEnumerableConverter
>>       JdbcAggregate(group=[{0, 1}])
>>         JdbcProject(MGR=[$3], i=[true])
>>           JdbcTableScan(table=[[SCOTT, EMP]])
>> > EXPLAIN PLAN WITHOUT IMPLEMENTATION FOR SELECT * FROM dept WHERE deptno NOT IN (SELECT mgr FROM emp);
>> LogicalProject(DEPTNO=[$0], DNAME=[$1], LOC=[$2])
>>   LogicalFilter(condition=[NOT(IN($0, {
>> LogicalProject(MGR=[$3])
>>   JdbcTableScan(table=[[SCOTT, EMP]])
>> }))])
>>     JdbcTableScan(table=[[SCOTT, DEPT]])
>> 
>> Why is the physical plan so complicated? Because 'mgr' is nullable. We
>> need to account for 3 different cases:
>> 
>> 1. dept.deptno is NULL (and therefore 'deptno NOT IN ...' evaluates to
>> UNKNOWN for every row where dept.deptno is NULL);
>> 2. dept.deptno is not NULL and the sub-query returns at least one NULL
>> value for mgr (and therefore 'deptno not in ...' evaluates to UNKNOWN
>> for every row);
>> 3. dept.deptno is not NULL and the sub-query returns only non-NULL
>> values of mgr (and therefore 'deptno not in ...' evaluates to TRUE or
>> FALSE for every row).
>> 
>> To distinguish between cases 2 and 3, the plan counts the number of
>> values and the number of not-null values from the sub-query.
>> 
>> I say that 'NOT IN is toxic' because a single null value in the
>> sub-query affects the result. The IN sub-query returns 3 values and
>> relational join can only account for two - match or not match.
>> 
>> There are actually cases where we care about the 3 values of IN. For
>> example 'SELECT *, deptno IN (SELECT mgr FROM emp) FROM dept'. But
>> usually IN occurs inside WHERE, and we can safely fold UNKNOWN into
>> FALSE.
>> 
>> It is tempting to talk about the cases where there are no NULL keys,
>> or UNKNOWN can safely be folded into FALSE. But I think we should be
>> talking about 3-valued IN (e.g. the scalar sub-query in the previous
>> paragraph). If we can 

Re: [VOTE] Release apache-calcite-1.24.0 (release candidate 0)

2020-07-21 Thread Rui Wang
+1 (non-binding)

- Downloaded the artifacts, checked the hash, verified the signature, built and
ran the tests. All OK.
- Used `diff -r` to compare the artifact with the git tag. As Julian pointed
out, there is a difference in the license file, but it can be addressed after
this vote.


-Rui

On Tue, Jul 21, 2020 at 6:00 AM Ruben Q L  wrote:

> Hello everyone,
>
> Thanks Chunwei for taking care of this release.
>
> +1 (non binding)
> - Local Calcite build with tests (Windows10 + JDK8): OK
> - Calcite-based application test suite: OK
>
> Like Enrico, I also found some issues with the deprecated versions
> of the "standard" rules.
> Initially I got an NPE [1], and one test did not work as
> expected. I did not look into it much, but it seems to be the same issue that
> Enrico described.
> All these problems went away when I replaced the deprecated rules with
> their new equivalents.
>
> Best,
> Ruben
>
> [1]
> Caused by: java.lang.NullPointerException: at index 0
> at com.onwbp.com.google.common.collect.ObjectArrays.checkElementNotNull(ObjectArrays.java:239)
> at com.onwbp.com.google.common.collect.ObjectArrays.checkElementsNotNull(ObjectArrays.java:230)
> at com.onwbp.com.google.common.collect.ObjectArrays.checkElementsNotNull(ObjectArrays.java:225)
> at com.onwbp.com.google.common.collect.ImmutableList.construct(ImmutableList.java:281)
> at com.onwbp.com.google.common.collect.ImmutableList.copyOf(ImmutableList.java:239)
> at com.onwbp.com.google.common.collect.ImmutableList.copyOf(ImmutableList.java:209)
> at com.onwbp.org.apache.calcite.tools.RuleSets.ofList(RuleSets.java:41)
>
>
> On Tue, Jul 21, 2020 at 09:35, Enrico Olivelli wrote:
>
> > +1 (non binding)
> > - verified hashes and checksums
> > - built from sources and ran tests (JDK14 on Linux)
> > - ran tests of HerdDB and some client application
> >
> > I only had to fix a deprecation warning, changing from
> > ReduceExpressionsRule.FILTER_INSTANCE to
> > CoreRules.FILTER_REDUCE_EXPRESSIONS, see [1] below.
> > Without the change to CoreRules.FILTER_REDUCE_EXPRESSIONS, all of the
> > tests of HerdDB failed with an NPE.
> > I stepped through the issue with a debugger:
> > ReduceExpressionsRule.FILTER_INSTANCE is null at runtime, and I can't
> > understand why.
> >
> > Not a big deal; changing to CoreRules.FILTER_REDUCE_EXPRESSIONS fixes the
> > issue.
> >
> > java.lang.NullPointerException
> > at org.apache.calcite.plan.AbstractRelOptPlanner.addRule(AbstractRelOptPlanner.java:147)
> > at org.apache.calcite.plan.volcano.VolcanoPlanner.addRule(VolcanoPlanner.java:416)
> > at herddb.sql.CalcitePlanner.runPlanner(CalcitePlanner.java:576)
> > at herddb.sql.CalcitePlanner.translate(CalcitePlanner.java:331)
> > at herddb.core.TestUtils.scan(TestUtils.java:70)
> >
> > [1]
> > https://github.com/diennea/herddb/pull/665/files#diff-ca87d7835fc281efa58a8809669017a9R576
> >
> >
> > Enrico
> >
> > On Tue, Jul 21, 2020 at 06:12, Francis Chuang <
> > francischu...@apache.org> wrote:
> >
> > > Thanks for making this release available for voting, Chunwei!
> > >
> > > Verified GPG Signature - OK
> > > Verified SHA512 - OK
> > > Ran tests per HOWTO (./gradlew check) - OK
> > > Quickly skimmed release notes - Looks good, but I agree with Julian's
> > > comments.
> > > Spot-checked a few JARs in the Maven repository - OK
> > >
> > > Environment (OpenJDK:latest docker container):
> > > Gradle 6.3 (via gradlew)
> > > Oracle Linux Server 7.8
> > > openjdk version "14.0.2" 2020-07-14
> > > OpenJDK Runtime Environment (build 14.0.2+12-46)
> > > OpenJDK 64-Bit Server VM (build 14.0.2+12-46, mixed mode, sharing)
> > >
> > > My vote is: +1 (binding)
> > >
> > > Francis
> > >
> > > On 21/07/2020 12:07 pm, Haisheng Yuan wrote:
> > > > Environment:
> > > > Mac OS X 10.15.1, JDK 1.8.0_162
> > > >
> > > > - Checked signatures and checksums, OK
> > > > - Ran unit tests (./gradlew build), OK
> > > >
> > > > +1 (binding)
> > > >
> > > > >> * why is 4032 'breaking'?
> > > > > With that change, CalcMergeRule won't match PhysicalNode (including
> > > > > EnumerableCalc) in VolcanoPlanner. Perhaps I should elaborate in the
> > > > > release notes.
> > > > >
> > > > >> * why is 3786 breaking? (recomputeDigest was not present in 1.23; the
> > > > >> remarks about caching digests are useful, so why aren't they in the
> > > > >> javadoc?)
> > > > > recomputeDigest() has been there since b0dab68 (2012-05-07). I will add
> > > > > the remarks to the javadoc after the release.
> > > >
> > > > Thanks,
> > > > Haisheng
> > > >
> > > > On 2020/07/21 01:14:17, Julian Hyde  wrote:
> > > > >> Downloaded, checked hashes, built and ran tests on Ubuntu/JDK 14;
> > > > >> checked distro against git (see issue 1); reviewed release notes (see
> > > > >> issue 2).
> > > > >>
> > > > >> +1 (binding) but issues 1 and 2 need to be fixed right after the
> > > > >> release.
> > > >>
> > > >> Issue 1. License file is not the same as in source control:
> > > >>
> > > >> 

Re: Re: [DISCUSS] New Join Type: ANTI_NOTIN

2020-07-21 Thread Haisheng Yuan
Thanks Julian for explaining NOT IN to everyone. I am sorry for not making 
myself clear.

For that kind of simple, uncorrelated NOT IN sub-query under 2-valued logic, 
SQL Server and Greenplum can generate an efficient plan with a null-aware ANTI 
SEMI join.

Hash ANTI SEMI Join for NOT IN 
 Join Condition: t1.a = t2.a
 ->  Table Scan on t1  
 ->  Hash 
   ->  Table Scan on t2 

ANTI_NOTIN can be interpreted as a null-aware ANTI join. As vlsi pointed 
out, Oracle supports that too.

Of course, that requires the query executor to deal specifically with this kind 
of join. But the optimizer should not be the database's bottleneck; it should 
generate as good a plan as possible. If the query executor can support the 
operator, good. If not, the downstream project can just disable it.
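
To make "null-aware" concrete, here is a minimal Java sketch of what such a
hash anti join has to do for an uncorrelated NOT IN inside WHERE. The class and
method names are hypothetical, and this is only an illustration of the
semantics, not how SQL Server, Greenplum, Oracle or Calcite implement it. The
two extra rules compared with a plain anti join are: a single NULL on the build
side eliminates every row, and a NULL probe key never passes unless the build
side is empty.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class NullAwareAntiJoin {
  /** Rows of t1.a that survive 'WHERE t1.a NOT IN (SELECT a FROM t2)'. */
  static List<Integer> notIn(List<Integer> t1, List<Integer> t2) {
    // Build side: hash the non-NULL keys of t2 and remember whether a NULL was seen.
    Set<Integer> built = new HashSet<>();
    boolean buildHasNull = false;
    for (Integer a : t2) {
      if (a == null) {
        buildHasNull = true;
      } else {
        built.add(a);
      }
    }
    List<Integer> out = new ArrayList<>();
    if (t2.isEmpty()) {
      out.addAll(t1);        // NOT IN over an empty set is TRUE for every row, NULL keys included
      return out;
    }
    if (buildHasNull) {
      return out;            // every comparison is FALSE or UNKNOWN, so no row passes
    }
    // Probe side: keep non-NULL keys that have no partner in the hash table.
    for (Integer a : t1) {
      if (a != null && !built.contains(a)) {
        out.add(a);
      }
    }
    return out;
  }

  public static void main(String[] args) {
    List<Integer> t1 = Arrays.asList(1, 2, null, 4);
    System.out.println(notIn(t1, Arrays.asList(2, 3)));         // [1, 4]
    System.out.println(notIn(t1, Arrays.asList(2, null)));      // []
    System.out.println(notIn(t1, Collections.emptyList()));     // [1, 2, null, 4]
  }
}
```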

Just like "a in (1,2,3...1000)": many databases (Greenplum, Postgres, TiDB ...) 
support an implicit lookup inside the Filter operator, so the list doesn't need 
to be expanded to OR or to an explicit inner join. The implicit lookup is more 
efficient. I think downstream projects would love to adapt to support the good 
plan; if not, they can choose to disable it, or rewrite it to a form they 
support during a post-processing stage after the best plan is generated. But we 
shouldn't refrain from generating this kind of plan just because some 
downstream project might not support it.
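
A small Java sketch of the difference between the two shapes of the same
filter, for illustration only. The names are hypothetical, the keys are plain
non-NULL ints (so this is 2-valued logic), and it says nothing about how any of
the databases mentioned actually implement the lookup. The OR chain tests the
literals one by one, while the lookup probes a hash set once.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.function.IntPredicate;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class InListLookup {
  /** 'a = l1 OR a = l2 OR ...': one comparison per literal in the worst case. */
  static IntPredicate asOrChain(int[] literals) {
    IntPredicate p = a -> false;
    for (int literal : literals) {
      p = p.or(a -> a == literal);
    }
    return p;
  }

  /** 'a IN (l1, l2, ...)' kept as a single predicate backed by a hash set. */
  static IntPredicate asHashLookup(int[] literals) {
    Set<Integer> set = Arrays.stream(literals).boxed().collect(Collectors.toSet());
    return a -> set.contains(a);
  }

  public static void main(String[] args) {
    int[] literals = IntStream.rangeClosed(1, 1000).toArray();
    IntPredicate orChain = asOrChain(literals);
    IntPredicate lookup = asHashLookup(literals);
    // Both forms give the same answer; only the evaluation cost differs.
    System.out.println(orChain.test(1000) + " " + lookup.test(1000));
    System.out.println(orChain.test(1001) + " " + lookup.test(1001));
  }
}
```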

On 2020/07/21 18:41:39, Julian Hyde  wrote: 
> I want to remind everyone how hard it is to evaluate NOT IN queries.
> Here is an example query:
> 
> sqlline> !connect
> jdbc:calcite:model=core/build/resources/test/hsqldb-model.json sa sa
> > !set outputFormat csv
> > EXPLAIN PLAN FOR SELECT * FROM dept WHERE deptno NOT IN (SELECT mgr FROM emp);
> EnumerableCalc(expr#0..6=[{inputs}], expr#7=[0], expr#8=[=($t3, $t7)],
> expr#9=[IS NULL($t6)], expr#10=[>=($t4, $t3)], expr#11=[AND($t9,
> $t10)], expr#12=[OR($t8, $t11)], proj#0..2=[{exprs}],
> $condition=[$t12])
>   EnumerableHashJoin(condition=[=($0, $5)], joinType=[left])
>     EnumerableNestedLoopJoin(condition=[true], joinType=[inner])
>       JdbcToEnumerableConverter
>         JdbcTableScan(table=[[SCOTT, DEPT]])
>       JdbcToEnumerableConverter
>         JdbcAggregate(group=[{}], c=[COUNT()], ck=[COUNT($3)])
>           JdbcTableScan(table=[[SCOTT, EMP]])
>     JdbcToEnumerableConverter
>       JdbcAggregate(group=[{0, 1}])
>         JdbcProject(MGR=[$3], i=[true])
>           JdbcTableScan(table=[[SCOTT, EMP]])
> > EXPLAIN PLAN WITHOUT IMPLEMENTATION FOR SELECT * FROM dept WHERE deptno NOT IN (SELECT mgr FROM emp);
> LogicalProject(DEPTNO=[$0], DNAME=[$1], LOC=[$2])
>   LogicalFilter(condition=[NOT(IN($0, {
> LogicalProject(MGR=[$3])
>   JdbcTableScan(table=[[SCOTT, EMP]])
> }))])
>     JdbcTableScan(table=[[SCOTT, DEPT]])
> 
> Why is the physical plan so complicated? Because 'mgr' is nullable. We
> need to account for 3 different cases:
> 
> 1. dept.deptno is NULL (and therefore 'deptno NOT IN ...' evaluates to
> UNKNOWN for every row where dept.deptno is NULL);
> 2. dept.deptno is not NULL and the sub-query returns at least one NULL
> value for mgr (and therefore 'deptno not in ...' evaluates to UNKNOWN
> for every row);
> 3. dept.deptno is not NULL and the sub-query returns only non-NULL
> values of mgr (and therefore 'deptno not in ...' evaluates to TRUE or
> FALSE for every row).
> 
> To distinguish between cases 2 and 3, the plan counts the number of
> values and the number of not-null values from the sub-query.
> 
> I say that 'NOT IN is toxic' because a single null value in the
> sub-query affects the result. The IN sub-query returns 3 values and
> relational join can only account for two - match or not match.
> 
> There are actually cases where we care about the 3 values of IN. For
> example 'SELECT *, deptno IN (SELECT mgr FROM emp) FROM dept'. But
> usually IN occurs inside WHERE, and we can safely fold UNKNOWN into
> FALSE.
> 
> It is tempting to talk about the cases where there are no NULL keys,
> or UNKNOWN can safely be folded into FALSE. But I think we should be
> talking about 3-valued IN (e.g. the scalar sub-query in the previous
> paragraph). If we can solve that, we can easily convert to a solution
> for 3-valued NOT IN.
> 
> Julian
> 
> On Mon, Jul 20, 2020 at 11:25 PM Haisheng Yuan  wrote:
> >
> > I think they might be orthogonal.
> > It is all about sub-query.
> >
> > On 2020/07/21 05:48:54, Danny Chan  wrote:
> > > If it is only a constant NOT IN predicate, how difficult is it to 
> > > rewrite it into a normal composite AND predicate before entering the 
> > > planning phase?
> > >
> > > Best,
> > > Danny Chan
> > > On July 21, 2020 at 12:35 PM +0800, Haisheng Yuan wrote:
> > > > Thanks Jinpeng for providing a good example of a NOT IN sub-query.
> > > >
> > > > I 100% agree with you that a correlated query won't be represented by 
> > > > the ANTI_NOTIN join type, and that is not the proposal's intention. What 
> > > > we are discussing here is not using ANTI_NOTIN to represent all the NOT 

Re: Re: [DISCUSS] New Join Type: ANTI_NOTIN

2020-07-21 Thread Vladimir Sitnikov
Just in case: Oracle DB has 'null aware' joins. For instance: 'hash join
anti na' which is a single join with two inputs.

I expect that databases can implement 'not in' semantics efficiently (at
least they should be more efficient than enumerable)

Vladimir


Re: Re: [DISCUSS] New Join Type: ANTI_NOTIN

2020-07-21 Thread Julian Hyde
I want to remind everyone how hard it is to evaluate NOT IN queries.
Here is an example query:

sqlline> !connect
jdbc:calcite:model=core/build/resources/test/hsqldb-model.json sa sa
> !set outputFormat csv
> EXPLAIN PLAN FOR SELECT * FROM dept WHERE deptno NOT IN (SELECT mgr FROM emp);
EnumerableCalc(expr#0..6=[{inputs}], expr#7=[0], expr#8=[=($t3, $t7)],
expr#9=[IS NULL($t6)], expr#10=[>=($t4, $t3)], expr#11=[AND($t9,
$t10)], expr#12=[OR($t8, $t11)], proj#0..2=[{exprs}],
$condition=[$t12])
  EnumerableHashJoin(condition=[=($0, $5)], joinType=[left])
    EnumerableNestedLoopJoin(condition=[true], joinType=[inner])
      JdbcToEnumerableConverter
        JdbcTableScan(table=[[SCOTT, DEPT]])
      JdbcToEnumerableConverter
        JdbcAggregate(group=[{}], c=[COUNT()], ck=[COUNT($3)])
          JdbcTableScan(table=[[SCOTT, EMP]])
    JdbcToEnumerableConverter
      JdbcAggregate(group=[{0, 1}])
        JdbcProject(MGR=[$3], i=[true])
          JdbcTableScan(table=[[SCOTT, EMP]])
> EXPLAIN PLAN WITHOUT IMPLEMENTATION FOR SELECT * FROM dept WHERE deptno NOT IN (SELECT mgr FROM emp);
LogicalProject(DEPTNO=[$0], DNAME=[$1], LOC=[$2])
  LogicalFilter(condition=[NOT(IN($0, {
LogicalProject(MGR=[$3])
  JdbcTableScan(table=[[SCOTT, EMP]])
}))])
    JdbcTableScan(table=[[SCOTT, DEPT]])

Why is the physical plan so complicated? Because 'mgr' is nullable. We
need to account for 3 different cases:

1. dept.deptno is NULL (and therefore 'deptno NOT IN ...' evaluates to
UNKNOWN for every row where dept.deptno is NULL);
2. dept.deptno is not NULL and the sub-query returns at least one NULL
value for mgr (and therefore 'deptno not in ...' evaluates to UNKNOWN
for every row);
3. dept.deptno is not NULL and the sub-query returns only non-NULL
values of mgr (and therefore 'deptno not in ...' evaluates to TRUE or
FALSE for every row).

To distinguish between cases 2 and 3, the plan counts the number of
values and the number of not-null values from the sub-query.
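
The counting trick can be sketched in a few lines of Java. This is an
illustration under assumptions, not Calcite code: the class and method names
are made up, NULL is modelled as a Java null and UNKNOWN as a null Boolean, and
the filter adds an explicit 'key != null' test that the plan above does not
show (presumably because DEPTNO is not nullable there). Note that notIn()
returns FALSE on a definite match even when the sub-query also contains a NULL;
inside WHERE that is indistinguishable from UNKNOWN.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

class NotInThreeValued {
  /** SQL semantics of 'key NOT IN (values)': TRUE, FALSE, or null for UNKNOWN. */
  static Boolean notIn(Integer key, List<Integer> values) {
    if (values.isEmpty()) {
      return Boolean.TRUE;              // empty sub-query: TRUE even for a NULL key
    }
    if (key == null) {
      return null;                      // case 1: NULL key
    }
    if (values.contains(key)) {
      return Boolean.FALSE;             // a definite match
    }
    return values.contains(null) ? null : Boolean.TRUE;   // case 2 vs. case 3
  }

  /** The WHERE filter derived from the two counts; UNKNOWN is folded into false. */
  static boolean filterByCounts(Integer key, List<Integer> values) {
    long c = values.size();                                       // c  = COUNT(*)
    long ck = values.stream().filter(Objects::nonNull).count();   // ck = COUNT(mgr)
    boolean matched = key != null && values.contains(key);        // left join found a partner
    // 'key != null' is an assumption added here for nullable keys.
    return c == 0 || (key != null && !matched && ck >= c);
  }

  public static void main(String[] args) {
    List<Integer> withNull = Arrays.asList(7839, null);
    List<Integer> noNull = Arrays.asList(7839, 7566);
    System.out.println(notIn(10, withNull) + " -> " + filterByCounts(10, withNull));
    System.out.println(notIn(10, noNull) + " -> " + filterByCounts(10, noNull));
  }
}
```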

I say that 'NOT IN is toxic' because a single null value in the
sub-query affects the result. The IN sub-query returns 3 values and
relational join can only account for two - match or not match.

There are actually cases where we care about the 3 values of IN. For
example 'SELECT *, deptno IN (SELECT mgr FROM emp) FROM dept'. But
usually IN occurs inside WHERE, and we can safely fold UNKNOWN into
FALSE.

It is tempting to talk about the cases where there are no NULL keys,
or UNKNOWN can safely be folded into FALSE. But I think we should be
talking about 3-valued IN (e.g. the scalar sub-query in the previous
paragraph). If we can solve that, we can easily convert to a solution
for 3-valued NOT IN.

Julian

On Mon, Jul 20, 2020 at 11:25 PM Haisheng Yuan  wrote:
>
> I think they might be orthogonal.
> It is all about sub-query.
>
> On 2020/07/21 05:48:54, Danny Chan  wrote:
> > If it is only a constant NOT IN predicate, how difficult is it to rewrite it 
> > into a normal composite AND predicate before entering the planning phase?
> >
> > Best,
> > Danny Chan
> > On July 21, 2020 at 12:35 PM +0800, Haisheng Yuan wrote:
> > > Thanks Jinpeng for providing a good example of a NOT IN sub-query.
> > >
> > > I 100% agree with you that a correlated query won't be represented by 
> > > the ANTI_NOTIN join type, and that is not the proposal's intention. What 
> > > we are discussing here is not using ANTI_NOTIN to represent all NOT IN 
> > > sub-queries; that is impossible. Instead, if you take a close look at the 
> > > example query, it is a simple uncorrelated NOT IN sub-query. That is the 
> > > target. Let's focus on that kind of query and ask ourselves this question: 
> > > Can such a simple query be transformed into an ANTI join to make the plan 
> > > efficient?
> > >
> > > Sadly, no. The reality is that this kind of query is not uncommon, and 
> > > may be much more common than correlated NOT IN sub-queries.
> > >
> > >
> > > Reply to Julian:
> > > > > How about making a sub-query type (in RexSubQuery), so it is gone
> > > > > before we reach algebra.
> > > It would be nice to have a NOT_IN sub-query type, without expanding NOT 
> > > IN to NOT(IN).
> > > However, if there is no ANTI_NOTIN in the join type (without reaching 
> > > algebra), does that mean the optimizer still can't generate an efficient 
> > > plan for simple NOT IN sub-queries?
> > >
> > > > > ANTI_NOTIN is a terrible name. ANTI means 'opposite' so ANTI_NOTIN is
> > > > > the opposite of NOT IN?!
> > > It depends on how people interpret ANTI. You interpret it as "opposite"; 
> > > I interpret it as "ANTI JOIN", meaning "anti join for NOT IN, instead of 
> > > NOT EXISTS". But it is just a naming issue. I am OK with changing it to 
> > > whatever name makes sense to the community, as long as it conveys the 
> > > meaning.
> > >
> > > Thanks,
> > > Haisheng
> > >
> > > On 2020/07/21 03:02:20, Jinpeng Wu  wrote:
> > > > Hi.
> > > >
> > > > In some SQL engine, the query
> > > > select * from A where c1 not in ( select c1 from B where 

Exception:Catch clause is unreachable

2020-07-21 Thread taojin

hi folks,

I hit a compiler error with calcite-core:1.23.0. The JDK version is 
openjdk version "11.0.7" 2020-04-14.


Here is the error log. Is this a bug?

Exception in thread "main" java.sql.SQLException: Error while executing 
SQL "select * from test.t": Error while compiling generated Java code:
public org.apache.calcite.linq4j.Enumerable bind(final 
org.apache.calcite.DataContext root) {
  final org.apache.calcite.linq4j.function.Function1 rowBuilderFactory 
= new org.apache.calcite.linq4j.function.Function1() {
    public org.apache.calcite.linq4j.function.Function0 apply(final 
java.sql.ResultSet resultSet) {

  return new org.apache.calcite.linq4j.function.Function0() {
  public Object apply() {
    try {
  return new Object[0];
    } catch (java.sql.SQLException e) {
  throw new RuntimeException(
    e);
    }
  }
    }
  ;
    }
    public Object apply(final Object resultSet) {
  return apply(
    (java.sql.ResultSet) resultSet);
    }
  }
  ;
  final org.apache.calcite.runtime.ResultSetEnumerable enumerable = 
org.apache.calcite.runtime.ResultSetEnumerable.of((javax.sql.DataSource) 
root.getRootSchema().getSubSchema("test").unwrap(javax.sql.DataSource.class), 
"SELECT *\nFROM hdb.t", rowBuilderFactory);

  enumerable.setTimeout(root);
  return enumerable;
}


public Class getElementType() {
  return org.apache.calcite.runtime.FlatLists.ComparableList.class;
}



    at org.apache.calcite.avatica.Helper.createException(Helper.java:56)
    at org.apache.calcite.avatica.Helper.createException(Helper.java:41)
    at 
org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:163)
    at 
org.apache.calcite.avatica.AvaticaStatement.executeQuery(AvaticaStatement.java:227)

    at mtest.App5.main(App5.java:45)
Caused by: java.lang.RuntimeException: Error while compiling generated 
Java code:
public org.apache.calcite.linq4j.Enumerable bind(final 
org.apache.calcite.DataContext root) {
  final org.apache.calcite.linq4j.function.Function1 rowBuilderFactory 
= new org.apache.calcite.linq4j.function.Function1() {
    public org.apache.calcite.linq4j.function.Function0 apply(final 
java.sql.ResultSet resultSet) {

  return new org.apache.calcite.linq4j.function.Function0() {
  public Object apply() {
    try {
  return new Object[0];
    } catch (java.sql.SQLException e) {
  throw new RuntimeException(
    e);
    }
  }
    }
  ;
    }
    public Object apply(final Object resultSet) {
  return apply(
    (java.sql.ResultSet) resultSet);
    }
  }
  ;
  final org.apache.calcite.runtime.ResultSetEnumerable enumerable = 
org.apache.calcite.runtime.ResultSetEnumerable.of((javax.sql.DataSource) 
root.getRootSchema().getSubSchema("test").unwrap(javax.sql.DataSource.class), 
"SELECT *\nFROM hdb.t", rowBuilderFactory);

  enumerable.setTimeout(root);
  return enumerable;
}


public Class getElementType() {
  return org.apache.calcite.runtime.FlatLists.ComparableList.class;
}



    at org.apache.calcite.avatica.Helper.wrap(Helper.java:37)
    at 
org.apache.calcite.adapter.enumerable.EnumerableInterpretable.toBindable(EnumerableInterpretable.java:128)
    at 
org.apache.calcite.prepare.CalcitePrepareImpl$CalcitePreparingStmt.implement(CalcitePrepareImpl.java:)

    at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:309)
    at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:208)
    at 
org.apache.calcite.prepare.CalcitePrepareImpl.prepare2_(CalcitePrepareImpl.java:632)
    at 
org.apache.calcite.prepare.CalcitePrepareImpl.prepare_(CalcitePrepareImpl.java:498)
    at 
org.apache.calcite.prepare.CalcitePrepareImpl.prepareSql(CalcitePrepareImpl.java:468)
    at 
org.apache.calcite.jdbc.CalciteConnectionImpl.parseQuery(CalciteConnectionImpl.java:231)
    at 
org.apache.calcite.jdbc.CalciteMetaImpl.prepareAndExecute(CalciteMetaImpl.java:552)
    at 
org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:675)
    at 
org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:156)

    ... 2 more
Caused by: org.codehaus.commons.compiler.CompileException: Line 8, 
Column 20: Catch clause is unreachable
    at 
org.codehaus.janino.UnitCompiler.compileError(UnitCompiler.java:12211)
    at 
org.codehaus.janino.UnitCompiler.compileTryCatch(UnitCompiler.java:3110)
    at 
org.codehaus.janino.UnitCompiler.compileTryCatchFinally(UnitCompiler.java:2966)
    at 
org.codehaus.janino.UnitCompiler.compileTryCatchFinallyWithResources(UnitCompiler.java:2770)

    at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:2742)
    at org.codehaus.janino.UnitCompiler.access$2300(UnitCompiler.java:215)
    at 
org.codehaus.janino.UnitCompiler$6.visitTryStatement(UnitCompiler.java:1499)
    at 
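
The root cause at the bottom of the trace mirrors a rule of the Java language
itself: a catch clause for a checked exception is rejected when the try block
cannot throw it. In the generated code the row has no columns, so the try block
is just 'return new Object[0];' and apparently contains no ResultSet call that
could throw SQLException. The following sketch uses javac (through the standard
javax.tools API, so it needs a JDK) rather than Janino to show the same pattern
being rejected. The class names and the one-column variant are made up for
illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

class UnreachableCatchDemo {
  // Same shape as the generated row builder: nothing in the try block throws SQLException.
  static final String EMPTY_ROW =
      "class EmptyRow {\n"
      + "  Object apply() {\n"
      + "    try {\n"
      + "      return new Object[0];\n"
      + "    } catch (java.sql.SQLException e) {\n"
      + "      throw new RuntimeException(e);\n"
      + "    }\n"
      + "  }\n"
      + "}\n";

  // Hypothetical one-column variant: getObject can throw SQLException, so the catch is legal.
  static final String ONE_COLUMN_ROW =
      "class OneColumnRow {\n"
      + "  Object apply(java.sql.ResultSet rs) {\n"
      + "    try {\n"
      + "      return new Object[] {rs.getObject(1)};\n"
      + "    } catch (java.sql.SQLException e) {\n"
      + "      throw new RuntimeException(e);\n"
      + "    }\n"
      + "  }\n"
      + "}\n";

  /** Compiles one class held in a string and reports whether javac accepted it. */
  static boolean compiles(String className, String source) {
    JavaCompiler javac = ToolProvider.getSystemJavaCompiler();  // null when no JDK compiler is present
    JavaFileObject unit = new SimpleJavaFileObject(
        URI.create("string:///" + className + ".java"), JavaFileObject.Kind.SOURCE) {
      @Override public CharSequence getCharContent(boolean ignoreEncodingErrors) {
        return source;
      }
    };
    try {
      Path out = Files.createTempDirectory("unreachable-catch");
      // The empty listener swallows diagnostics; only the verdict is of interest here.
      return javac.getTask(null, null, diagnostic -> { },
          List.of("-d", out.toString()), null, List.of(unit)).call();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("empty row compiles: " + compiles("EmptyRow", EMPTY_ROW));
    System.out.println("one-column row compiles: " + compiles("OneColumnRow", ONE_COLUMN_ROW));
  }
}
```

If that reading is right, the compiler is behaving correctly and the thing to
investigate is why the generated row has zero columns, for example whether the
table metadata for test.t reports any columns at all.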

    at 
org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:156)

    ... 2 more
Caused by: org.codehaus.commons.compiler.CompileException: Line 8, 
Column 20: Catch clause is unreachable
    at 
org.codehaus.janino.UnitCompiler.compileError(UnitCompiler.java:12211)
    at 
org.codehaus.janino.UnitCompiler.compileTryCatch(UnitCompiler.java:3110)
    at 
org.codehaus.janino.UnitCompiler.compileTryCatchFinally(UnitCompiler.java:2966)
    at 
org.codehaus.janino.UnitCompiler.compileTryCatchFinallyWithResources(UnitCompiler.java:2770)

    at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:2742)
    at org.codehaus.janino.UnitCompiler.access$2300(UnitCompiler.java:215)
    at 
org.codehaus.janino.UnitCompiler$6.visitTryStatement(UnitCompiler.java:1499)
    at 

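For readers of the trace above: the generated try body is just
`return new Object[0];`, which cannot throw java.sql.SQLException, so the
compiler rejects the catch clause (javac also rejects a catch of a checked
exception that the try body can never throw). A minimal sketch of the
reachable case follows. The class UnreachableCatchDemo and the helper
readColumn are hypothetical and only stand in for a JDBC getter; this is not
Calcite's generated code.

```java
import java.sql.SQLException;

public class UnreachableCatchDemo {
    // Hypothetical stand-in for a JDBC getter such as ResultSet.getObject(int),
    // which is declared to throw the checked SQLException.
    static Object readColumn(int index) throws SQLException {
        if (index < 1) {
            throw new SQLException("invalid column index " + index);
        }
        return "value-" + index;
    }

    // Row builder for a row with columnCount columns. The try body calls a
    // method that declares SQLException, so the catch clause is reachable.
    static Object[] buildRow(int columnCount) {
        try {
            Object[] row = new Object[columnCount];
            for (int i = 0; i < columnCount; i++) {
                row[i] = readColumn(i + 1);
            }
            return row;
        } catch (SQLException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildRow(2).length);
        // By contrast, a body shaped like the generated one,
        //   try { return new Object[0]; } catch (SQLException e) { ... }
        // contains nothing that can throw SQLException, so it does not compile.
    }
}
```

The empty array in the log suggests the row type of test.t was resolved with
zero columns; that is an inference from the generated code, not something
confirmed in this thread.
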
Re: [DISCUSS] New RexNode: RexListCmp

2020-07-21 Thread Scott Reynolds
I wanted to reply and share our recent requirement for handling SQL like
the following `error_code IN (3002, 3030)' and the challenges we
faced. For our implementation on top of Apache Kudu, each disjunction
creates a `Scanner' – a resource we need to limit as it represents a
denial of service attack vector (e.g. too many scanners, heap fills
up). Good news for us is Kudu ships with an [`inListPredicate'] and we
expected a plan to include the `SqlKind.IN' as the function which we
could translate into `inListPredicate'. We were surprised when it didn't
do that. We did eventually make this work for our customers with the hack
below, but it is not a valid plan (for instance, unparsing the plan
produces an invalid SQL query) and is therefore brittle (but *works*
:fingers-crossed:).

┌
│ // This is not the correct use of Array.
│ final RelDataType listType =
│     builder.getTypeFactory().createArrayType(fieldType, -1);
│ return builder.call(SqlStdOperatorTable.IN,
│  builder.field(conditionTableName, columnName),
│  rexBuilder.makeLiteral(resultValue, listType, true));
└
We filed a ticket to do it the correct way, which is to take all the
disjunctions, and "un-parse" them into `inListPredicate' calls *if
possible*. This struck us as pretty dense code *but* would apply to
other disjunctions.

It would be *great* if Calcite shipped with a `RexCall' that our
implementation could translate with little effort into a
`inListPredicate'.


[`inListPredicate']
https://kudu.apache.org/apidocs/org/apache/kudu/client/KuduPredicate.html#newInListPredicate-org.apache.kudu.ColumnSchema-java.util.List-
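
The "take all the disjunctions and turn them into an in-list if possible"
idea can be sketched without any Calcite or Kudu types. Everything below is
hypothetical: Equals stands in for one `column = literal` term of an OR,
InList for the arguments of an in-list predicate, and collapse gives up (empty
result) as soon as the terms refer to more than one column.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

public class InListSketch {
    /** Hypothetical stand-in for one "column = literal" term of a disjunction. */
    static final class Equals {
        final String column;
        final Object value;

        Equals(String column, Object value) {
            this.column = column;
            this.value = value;
        }
    }

    /** Hypothetical stand-in for the arguments of an in-list predicate. */
    static final class InList {
        final String column;
        final List<Object> values;

        InList(String column, List<Object> values) {
            this.column = column;
            this.values = values;
        }
    }

    /** Collapses OR-ed equalities on a single column; empty if that is not possible. */
    static Optional<InList> collapse(List<Equals> disjuncts) {
        if (disjuncts.isEmpty()) {
            return Optional.empty();
        }
        String column = disjuncts.get(0).column;
        List<Object> values = new ArrayList<>();
        for (Equals e : disjuncts) {
            if (!e.column.equals(column)) {
                return Optional.empty(); // mixed columns: keep the original OR
            }
            if (!values.contains(e.value)) {
                values.add(e.value); // drop duplicate literals
            }
        }
        return Optional.of(new InList(column, values));
    }

    public static void main(String[] args) {
        List<Equals> or = new ArrayList<>();
        or.add(new Equals("error_code", 3002));
        or.add(new Equals("error_code", 3030));
        collapse(or).ifPresent(in -> System.out.println(in.column + " IN " + in.values));
    }
}
```

A real implementation would walk RexNode trees and would have to deal with
nested ORs, casts and NULL literals, none of which this sketch attempts.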

On Mon, Jul 20, 2020 at 3:09 PM Stamatis Zampetakis 
wrote:

> Another quick thought as far as it concerns the IN operator would be to use
> RexCall as it is right now where the first operand in the list is a
> RexInputRef for instance and the rest are the literals.
> I assume that taking this direction would need to change a bit the
> respective SqlOperator.
>
> I haven't thought of this thoroughly so maybe there are important things
> that I am missing.
>
> Best,
> Stamatis
>
>
> On Tue, Jul 21, 2020 at 12:41 AM Julian Hyde  wrote:
>
> > The name isn't very intuitive.
> >
> > The concept of a list and a comparison operator seems OK. As Vladimir
> > points out, it is somewhat similar to RexSubQuery, so maybe this could
> > be a sub-class (but organizing the data a bit more efficiently).
> >
> > I would be very wary of null semantics. RexNode scalar operators are
> > forced to do 3-valued logic, but this is almost a relational operator
> > and it would be better without that burden.
> >
> > Julian
> >
> >
> >
> > On Mon, Jul 20, 2020 at 3:45 AM Vladimir Sitnikov
> >  wrote:
> > >
> > > >Do you know what is the impact on Enumerable implementation?
> > >
> > > I guess there are plenty of options there.
> > >
> > > The key question regarding RexListCmp is that, as we introduce a new Rex
> > > node, all the planning rules and all engines
> > > must support it somehow.
> > >
> > > Technically speaking, we have RexSubQuery.
> > > Haisheng, have you considered an option to stick with RexSubQuery to
> > > avoid having two more-or-less the same rex classes?
> > >
> > > Vladimir
> >
>


Re: [VOTE] Release apache-calcite-1.24.0 (release candidate 0)

2020-07-21 Thread Ruben Q L
Hello everyone,

Thanks Chunwei for taking care of this release.

+1 (non binding)
- Local Calcite build with tests (Windows10 + JDK8): OK
- Calcite-based application test suite: OK

Similarly to Enrico, I also found some issues with the deprecated version
of the "standard" rules.
Initially I had an NPE [1] and also one test that was not working as
expected. I did not look much into it, but it seemed to be the same issue that
Enrico described.
All these problems were solved when I replaced the deprecated rules with
their new equivalents.

Best,
Ruben

[1]
Caused by: java.lang.NullPointerException: at index 0
at
com.onwbp.com.google.common.collect.ObjectArrays.checkElementNotNull(ObjectArrays.java:239)
at
com.onwbp.com.google.common.collect.ObjectArrays.checkElementsNotNull(ObjectArrays.java:230)
at
com.onwbp.com.google.common.collect.ObjectArrays.checkElementsNotNull(ObjectArrays.java:225)
at
com.onwbp.com.google.common.collect.ImmutableList.construct(ImmutableList.java:281)
at
com.onwbp.com.google.common.collect.ImmutableList.copyOf(ImmutableList.java:239)
at
com.onwbp.com.google.common.collect.ImmutableList.copyOf(ImmutableList.java:209)
at com.onwbp.org.apache.calcite.tools.RuleSets.ofList(RuleSets.java:41)


On Tue, Jul 21, 2020 at 09:35, Enrico Olivelli wrote:

> +1 (non binding)
> - verified hashes and checksums
> - built from sources and run tests (JDK14 on Linux)
> - run tests of HerdDB and some client application
>
> I only had to fix a deprecation warning, changing from
> ReduceExpressionsRule.FILTER_INSTANCE to
> CoreRules.FILTER_REDUCE_EXPRESSIONS, see [1] below
> without the change of CoreRules.FILTER_REDUCE_EXPRESSIONS all of the tests
> of HerdDB failed with a NPE,
> I debugged the issue with a debugger and
> the ReduceExpressionsRule.FILTER_INSTANCE at runtime is null, I can't
> understand why.
>
> Not a big deal, changing to CoreRules.FILTER_REDUCE_EXPRESSIONS fixes the
> issue
>
> java.lang.NullPointerException
> at
>
> org.apache.calcite.plan.AbstractRelOptPlanner.addRule(AbstractRelOptPlanner.java:147)
> at
>
> org.apache.calcite.plan.volcano.VolcanoPlanner.addRule(VolcanoPlanner.java:416)
> at herddb.sql.CalcitePlanner.runPlanner(CalcitePlanner.java:576)
> at herddb.sql.CalcitePlanner.translate(CalcitePlanner.java:331)
> at herddb.core.TestUtils.scan(TestUtils.java:70)
>
> [1]
>
> https://github.com/diennea/herddb/pull/665/files#diff-ca87d7835fc281efa58a8809669017a9R576
>
>
> Enrico
>
> On Tue, Jul 21, 2020 at 06:12, Francis Chuang <
> francischu...@apache.org> wrote:
>
> > Thanks for making this release available for voting, Chunwei!
> >
> > Verified GPG Signature - OK
> > Verified SHA512 - OK
> > Ran tests per HOWTO (./gradlew check) - OK
> > Quickly skimmed release notes - Looks good, but I agree with Julian's
> > comments.
> > Spot-checked a few JARs in the Maven repository - OK
> >
> > Environment (OpenJDK:latest docker container):
> > Gradle 6.3 (via gradlew)
> > Oracle Linux Server 7.8
> > openjdk version "14.0.2" 2020-07-14
> > OpenJDK Runtime Environment (build 14.0.2+12-46)
> > OpenJDK 64-Bit Server VM (build 14.0.2+12-46, mixed mode, sharing)
> >
> > My vote is: +1 (binding)
> >
> > Francis
> >
> > On 21/07/2020 12:07 pm, Haisheng Yuan wrote:
> > > Environment:
> > > Mac OS X 10.15.1, JDK 1.8.0_162
> > >
> > > - Checked signatures and checksums, OK
> > > - Ran unit tests (./gradlew build), OK
> > >
> > > +1 (binding)
> > >
> > >> * why is 4032 'breaking'?
> > > With that change, the CalcMergeRule won't match PhysicalNode (including
> > > EnumerableCalc) in VolcanoPlanner. Perhaps I should elaborate in the
> > > release notes.
> > >
> > >> * why is 3786 breaking? (recomputeDigest was not present in 1.23; the
> > >> remarks about caching digests are useful, so why aren't they in the
> > >> javadoc?)
> > > recomputeDigest() has been there since b0dab68 (2012-05-07). I will add
> > > the remarks into the javadoc after release.
> > >
> > > Thanks,
> > > Haisheng
> > >
> > > On 2020/07/21 01:14:17, Julian Hyde  wrote:
> > >> Downloaded, checked hashes, built and ran tests on Ubuntu/JDK 14;
> > >> checked distro against git (see issue 1); reviewed release notes (see
> > >> issue 2).
> > >>
> > >> +1 (binding) but issues 1 and 2 need to be fixed right after the
> > >> release.
> > >>
> > >> Issue 1. License file is not the same as in source control:
> > >>
> > >> diff -r ./LICENSE /tmp/apache-calcite-1.24.0-src/LICENSE
> > >> 177a178,189
> > >>>
> > >>> Additional License files can be found in the 'licenses' folder located in the same directory as the LICENSE file (i.e. this file)
> > >>>
> > >>> - Software produced outside the ASF which is available under other licenses (not Apache-2.0)
> > >>>
> > >>> MIT
> > >>> * cobyism:html5shiv:3.7.2
> > >>> * font-awesome:font-awesome-code:4.2.0
> > >>> * gridsim:gridsim:
> > >>> * jekyll:jekyll:
> > >>> * normalize:normalize:3.0.2
> > >>> * respond:respond:1.4.2
> > >>
> > >> Can you fix the release instructions that the 

Re: custom metadata logic

2020-07-21 Thread Tal Glanzman
hi Danny,

I didn't mention, but I use calcite 1.21.0.

have attempted to use the function already with no success; it didn't seem
to actually change the provider.
have used the RelMetadataQuery.THREAD_PROVIDERS directly and it worked
perfectly.

i assume it is because the function RelOptCluster#setMetadataProvider
- in 1.21.0 doesn't use THREAD_PROVIDERS; ([1])
- in master it does use THREAD_PROVIDERS; ([2])

will upgrade to 1.24.0 when it is released and will attempt again.

[1]
https://github.com/apache/calcite/blob/adc1532de853060d24fd0129257a3fae306fb55c/core/src/main/java/org/apache/calcite/plan/RelOptCluster.java#L135
[2]
https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/plan/RelOptCluster.java#L142
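
The difference described above can be modelled in a few lines. This is a
simplified model, not Calcite's code: Provider, Cluster and both setter names
are hypothetical, and the only thing taken from the thread is the claim that
the query side reads a thread-local while the 1.21.0 setter does not write to
it.

```java
public class ThreadProviderSketch {
    /** Hypothetical stand-in for a metadata provider. */
    interface Provider {
        String name();
    }

    static final Provider DEFAULT = () -> "default";

    // Models RelMetadataQuery.THREAD_PROVIDERS: one provider per thread.
    static final ThreadLocal<Provider> THREAD_PROVIDERS =
        ThreadLocal.withInitial(() -> DEFAULT);

    /** Hypothetical stand-in for a cluster that owns a provider field. */
    static final class Cluster {
        private Provider provider = DEFAULT;

        // Models the behaviour attributed to 1.21.0: only the field changes.
        void setProviderFieldOnly(Provider p) {
            this.provider = p;
        }

        // Models the behaviour attributed to master: the thread-local changes too.
        void setProviderAndThreadLocal(Provider p) {
            this.provider = p;
            THREAD_PROVIDERS.set(p);
        }

        Provider getProvider() {
            return provider;
        }
    }

    // Models a metadata query that consults the thread-local, not the cluster.
    static String providerSeenByQuery() {
        return THREAD_PROVIDERS.get().name();
    }

    public static void main(String[] args) {
        Cluster cluster = new Cluster();
        Provider custom = () -> "custom";

        cluster.setProviderFieldOnly(custom);
        System.out.println(providerSeenByQuery()); // the query still sees "default"

        cluster.setProviderAndThreadLocal(custom);
        System.out.println(providerSeenByQuery()); // now the query sees "custom"
    }
}
```

In this model, setting the thread-local directly has the same effect as the
second setter, which matches what worked on 1.21.0.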

On Tue, Jul 21, 2020 at 11:45 AM Danny Chan  wrote:

> The suggested way to set up the RelMetadataProvider is through
> RelOptCluster#setMetadataProvider [1]. If you want to customize the
> RelMetadataQuery, take the document in RelMetadataQueryBase [2] for a
> reference.
>
> [1]
> https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/plan/RelOptCluster.java#L142
> [2]
> https://github.com/apache/calcite/blob/d2c859098acb3210ac5cad90c5ea28a3317f0386/core/src/main/java/org/apache/calcite/rel/metadata/RelMetadataQueryBase.java#L58
>
> Best,
> Danny Chan
> On Jul 21, 2020, 4:18 PM +0800, Tal Glanzman wrote:
> > thank you it is working great!
> >
> > On Tue, Jul 21, 2020 at 12:45 AM Stamatis Zampetakis 
> > wrote:
> >
> > > Hi Taz,
> > >
> > > If you are relying on the RelMetadataQuery [1] API then you may need
> > > to set your provider in THREAD_PROVIDERS, similar to the way it is done in
> > > RelMetadataTest [2].
> > >
> > > Best,
> > > Stamatis
> > >
> > > [1]
> > >
> > >
> https://github.com/apache/calcite/blob/7a462f2b2f78aa12068b691c1e423ea4c8a825e4/core/src/main/java/org/apache/calcite/rel/metadata/RelMetadataQuery.java#L75
> > > [2]
> > >
> > >
> https://github.com/apache/calcite/blob/7a462f2b2f78aa12068b691c1e423ea4c8a825e4/core/src/test/java/org/apache/calcite/test/RelMetadataTest.java#L893
> > >
> > > On Mon, Jul 20, 2020 at 6:15 PM Tal Glanzman 
> > > wrote:
> > >
> > > > hi,
> > > >
> > > > I am trying to figure out how to add custom logic for providing
> metadata
> > > to
> > > > nodes inside my adapter with no luck.
> > > >
> > > > I have an implementation of my own RelMetadataProvider, as described
> in
> > > the
> > > > docs.
> > > >
> > > > After looking around, i still couldn't find a way to use this
> provider in
> > > > the planning phase
> > > > - no property in the jdbc connection string
> > > > - tried to invoke setMetadataProvider(provider) on the cluster of the
> > > > initial nodes of the tree
> > > >
> > > > drilling a bit, i notice that the provider in use is
> > > > JaninoRelMetadataProvider with provider=DefaultRelMetadataProvider.
> > > >
> > > > Is there a way to do such a thing using the standard execution?
> > > >
> > > > any guidance will be much appreciated, thanks!
> > > >
> > >
>


Calcite-Master - Build # 1848 - Failure

2020-07-21 Thread Apache Jenkins Server
The Apache Jenkins build system has built Calcite-Master (build #1848)

Status: Failure

Check console output at https://builds.apache.org/job/Calcite-Master/1848/ to 
view the results.

Re: custom metadata logic

2020-07-21 Thread Danny Chan
The suggested way to set up the RelMetadataProvider is through 
RelOptCluster#setMetadataProvider [1]. If you want to customize the 
RelMetadataQuery, take the document in RelMetadataQueryBase [2] for a reference.

[1] 
https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/plan/RelOptCluster.java#L142
[2] 
https://github.com/apache/calcite/blob/d2c859098acb3210ac5cad90c5ea28a3317f0386/core/src/main/java/org/apache/calcite/rel/metadata/RelMetadataQueryBase.java#L58

Best,
Danny Chan
On Jul 21, 2020, 4:18 PM +0800, Tal Glanzman wrote:
> thank you it is working great!
>
> On Tue, Jul 21, 2020 at 12:45 AM Stamatis Zampetakis 
> wrote:
>
> > Hi Taz,
> >
> > If you are relying on the RelMetadataQuery [1] API then you may need to set
> > your provider in THREAD_PROVIDERS, similar to the way it is done in
> > RelMetadataTest [2].
> >
> > Best,
> > Stamatis
> >
> > [1]
> >
> > https://github.com/apache/calcite/blob/7a462f2b2f78aa12068b691c1e423ea4c8a825e4/core/src/main/java/org/apache/calcite/rel/metadata/RelMetadataQuery.java#L75
> > [2]
> >
> > https://github.com/apache/calcite/blob/7a462f2b2f78aa12068b691c1e423ea4c8a825e4/core/src/test/java/org/apache/calcite/test/RelMetadataTest.java#L893
> >
> > On Mon, Jul 20, 2020 at 6:15 PM Tal Glanzman 
> > wrote:
> >
> > > hi,
> > >
> > > I am trying to figure out how to add custom logic for providing metadata
> > to
> > > nodes inside my adapter with no luck.
> > >
> > > I have an implementation of my own RelMetadataProvider, as described in
> > the
> > > docs.
> > >
> > > After looking around, i still couldn't find a way to use this provider in
> > > the planning phase
> > > - no property in the jdbc connection string
> > > - tried to invoke setMetadataProvider(provider) on the cluster of the
> > > initial nodes of the tree
> > >
> > > drilling a bit, i notice that the provider in use is
> > > JaninoRelMetadataProvider with provider=DefaultRelMetadataProvider.
> > >
> > > Is there a way to do such a thing using the standard execution?
> > >
> > > any guidance will be much appreciated, thanks!
> > >
> >


Re: custom metadata logic

2020-07-21 Thread Tal Glanzman
thank you it is working great!

On Tue, Jul 21, 2020 at 12:45 AM Stamatis Zampetakis 
wrote:

> Hi Taz,
>
> If you are relying on the RelMetadataQuery [1] API then you may need to set
> your provider in THREAD_PROVIDERS, similar to the way it is done in
> RelMetadataTest [2].
>
> Best,
> Stamatis
>
> [1]
>
> https://github.com/apache/calcite/blob/7a462f2b2f78aa12068b691c1e423ea4c8a825e4/core/src/main/java/org/apache/calcite/rel/metadata/RelMetadataQuery.java#L75
> [2]
>
> https://github.com/apache/calcite/blob/7a462f2b2f78aa12068b691c1e423ea4c8a825e4/core/src/test/java/org/apache/calcite/test/RelMetadataTest.java#L893
>
> On Mon, Jul 20, 2020 at 6:15 PM Tal Glanzman 
> wrote:
>
> > hi,
> >
> > I am trying to figure out how to add custom logic for providing metadata
> to
> > nodes inside my adapter with no luck.
> >
> > I have an implementation of my own RelMetadataProvider, as described in
> the
> > docs.
> >
> > After looking around, i still couldn't find a way to use this provider in
> > the planning phase
> > - no property in the jdbc connection string
> > - tried to invoke setMetadataProvider(provider) on the cluster of the
> > initial nodes of the tree
> >
> > drilling a bit, i notice that the provider in use is
> > JaninoRelMetadataProvider with provider=DefaultRelMetadataProvider.
> >
> > Is there a way to do such a thing using the standard execution?
> >
> > any guidance will be much appreciated, thanks!
> >
>


Re: [VOTE] Release apache-calcite-1.24.0 (release candidate 0)

2020-07-21 Thread Enrico Olivelli
+1 (non binding)
- verified hashes and checksums
- built from sources and run tests (JDK14 on Linux)
- run tests of HerdDB and some client application

I only had to fix a deprecation warning, changing from
ReduceExpressionsRule.FILTER_INSTANCE to
CoreRules.FILTER_REDUCE_EXPRESSIONS (see [1] below).
Without the change to CoreRules.FILTER_REDUCE_EXPRESSIONS, all of the tests
of HerdDB failed with an NPE.
I stepped through the issue with a debugger:
ReduceExpressionsRule.FILTER_INSTANCE is null at runtime, and I can't
understand why.

Not a big deal; changing to CoreRules.FILTER_REDUCE_EXPRESSIONS fixes the
issue.

java.lang.NullPointerException
at
org.apache.calcite.plan.AbstractRelOptPlanner.addRule(AbstractRelOptPlanner.java:147)
at
org.apache.calcite.plan.volcano.VolcanoPlanner.addRule(VolcanoPlanner.java:416)
at herddb.sql.CalcitePlanner.runPlanner(CalcitePlanner.java:576)
at herddb.sql.CalcitePlanner.translate(CalcitePlanner.java:331)
at herddb.core.TestUtils.scan(TestUtils.java:70)

[1]
https://github.com/diennea/herddb/pull/665/files#diff-ca87d7835fc281efa58a8809669017a9R576


Enrico
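
One general way a `static final` field can be observed as null at runtime is
a class-initialization cycle. Whether that is what happens with
ReduceExpressionsRule.FILTER_INSTANCE is not established in this thread; the
sketch below only shows the mechanism. The classes Core and Legacy and their
fields are hypothetical.

```java
public class InitCycleSketch {
    static final class Core {
        // Reading Legacy.INSTANCE here starts Legacy's initialization while
        // Core is still in the middle of its own, before RULE is assigned.
        static final Object EAGER = Legacy.INSTANCE;
        static final Object RULE = new Object();
    }

    static final class Legacy {
        // If Core is already being initialized by this thread, Core.RULE still
        // holds its default value (null) when it is read here.
        static final Object INSTANCE = Core.RULE;
    }

    public static void main(String[] args) {
        Object rule = Core.RULE; // initializes Core first, then Legacy
        System.out.println("Core.RULE is null: " + (rule == null));
        System.out.println("Legacy.INSTANCE is null: " + (Legacy.INSTANCE == null));
    }
}
```

If Legacy happens to be initialized first, the same two classes leave
Legacy.INSTANCE non-null, so the symptom depends on which class the
application touches first.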

On Tue, Jul 21, 2020 at 06:12, Francis Chuang <
francischu...@apache.org> wrote:

> Thanks for making this release available for voting, Chunwei!
>
> Verified GPG Signature - OK
> Verified SHA512 - OK
> Ran tests per HOWTO (./gradlew check) - OK
> Quickly skimmed release notes - Looks good, but I agree with Julian's
> comments.
> Spot-checked a few JARs in the Maven repository - OK
>
> Environment (OpenJDK:latest docker container):
> Gradle 6.3 (via gradlew)
> Oracle Linux Server 7.8
> openjdk version "14.0.2" 2020-07-14
> OpenJDK Runtime Environment (build 14.0.2+12-46)
> OpenJDK 64-Bit Server VM (build 14.0.2+12-46, mixed mode, sharing)
>
> My vote is: +1 (binding)
>
> Francis
>
> On 21/07/2020 12:07 pm, Haisheng Yuan wrote:
> > Environment:
> > Mac OS X 10.15.1, JDK 1.8.0_162
> >
> > - Checked signatures and checksums, OK
> > - Ran unit tests (./gradlew build), OK
> >
> > +1 (binding)
> >
> >> * why is 4032 'breaking'?
> > With that change, the CalcMergeRule won't match PhysicalNode (including
> > EnumerableCalc) in VolcanoPlanner. Perhaps I should elaborate in the
> > release notes.
> >
> >> * why is 3786 breaking? (recomputeDigest was not present in 1.23; the
> >> remarks about caching digests are useful, so why aren't they in the
> >> javadoc?)
> > recomputeDigest() has been there since b0dab68 (2012-05-07). I will add
> > the remarks into the javadoc after release.
> >
> > Thanks,
> > Haisheng
> >
> > On 2020/07/21 01:14:17, Julian Hyde  wrote:
> >> Downloaded, checked hashes, built and ran tests on Ubuntu/JDK 14;
> >> checked distro against git (see issue 1); reviewed release notes (see
> >> issue 2).
> >>
> >> +1 (binding) but issues 1 and 2 need to be fixed right after the
> >> release.
> >>
> >> Issue 1. License file is not the same as in source control:
> >>
> >> diff -r ./LICENSE /tmp/apache-calcite-1.24.0-src/LICENSE
> >> 177a178,189
> >>>
> >>> Additional License files can be found in the 'licenses' folder located in the same directory as the LICENSE file (i.e. this file)
> >>>
> >>> - Software produced outside the ASF which is available under other licenses (not Apache-2.0)
> >>>
> >>> MIT
> >>> * cobyism:html5shiv:3.7.2
> >>> * font-awesome:font-awesome-code:4.2.0
> >>> * gridsim:gridsim:
> >>> * jekyll:jekyll:
> >>> * normalize:normalize:3.0.2
> >>> * respond:respond:1.4.2
> >>
> >> Can you fix the release instructions that the generated LICENSE needs
> >> to be committed (probably at the same time you revise the release
> >> notes).
> >>
> >> Issue 2. Release notes
> >>
> >> For the 'highlights', I prefer a paragraph with hyperlinks over a list
> >> (see
> >> https://github.com/apache/calcite/blob/calcite-1.24.0-rc0/site/_docs/history.md#1180--2018-12-21).
> >>
> >> Regarding categorization:
> >> * why is 4032 'breaking'?
> >> * why is 3786 breaking? (recomputeDigest was not present in 1.23; the
> >> remarks about caching digests are useful, so why aren't they in the
> >> javadoc?)
> >> * we need a note that a bunch of methods are deprecated in this
> >> release and will be removed before 1.25 (see 3923, 4023 and 4079).
> >> This will break semantic versioning in 1.25, so is a big deal.
> >> * 4073, 3224, 4056, 4008, 3972, 4060 are listed as new features, but I
> >> think they are bug fixes or improved implementations
> >> * 3946, 4089, 4087 are listed as fixes but could be listed as new
> >> features
> >> * 4075 should be under 'test suite'
> >> * 4094 description does not need 'follow-up after review comments'
> >> * 4086 is an upgrade, so should be in 'bug fixes', not documentation
> >> * A few places SQL and Java keywords are not in code font (e.g. NPE,
> >> IllegalArgumentException, RexNode, Expression, HAVING, ARRAY, MAP,
> >> CAST)
> >>
> >> Julian
> >>
> >> On Mon, Jul 20, 2020 at 12:01 PM Michael Mior  wrote:
> >>>
> >>> +1
> >>>
> >>> Checked hash and signature and compiled and ran tests. 

Re: Re: [DISCUSS] New Join Type: ANTI_NOTIN

2020-07-21 Thread Haisheng Yuan
I think they might be orthogonal.
It is all about sub-query.
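
For anyone joining the thread here: the reason NOT IN does not map onto an
ordinary (NOT EXISTS style) anti join is NULL handling. A small sketch of the
two semantics over an in-memory list follows; the class and method names are
made up for illustration, and a null Boolean is used to represent UNKNOWN.

```java
import java.util.Arrays;
import java.util.List;

public class NotInSketch {
    // SQL three-valued "x NOT IN (list)": TRUE, FALSE, or null for UNKNOWN.
    static Boolean notIn(Integer x, List<Integer> list) {
        if (list.isEmpty()) {
            return Boolean.TRUE; // NOT IN over an empty set is TRUE, even for a NULL x
        }
        if (x == null) {
            return null; // NULL compared with anything is UNKNOWN
        }
        boolean sawNull = false;
        for (Integer v : list) {
            if (v == null) {
                sawNull = true;
            } else if (v.equals(x)) {
                return Boolean.FALSE; // a definite match
            }
        }
        return sawNull ? null : Boolean.TRUE;
    }

    // NOT EXISTS style anti join on equality: a row is kept unless some
    // non-null list value equals a non-null x.
    static boolean antiJoinKeeps(Integer x, List<Integer> list) {
        for (Integer v : list) {
            if (x != null && v != null && v.equals(x)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Integer> withNull = Arrays.<Integer>asList(1, null);
        // A WHERE clause keeps a row only when the predicate is TRUE.
        System.out.println("NOT IN keeps 2: " + Boolean.TRUE.equals(notIn(2, withNull)));
        System.out.println("anti join keeps 2: " + antiJoinKeeps(2, withNull));
    }
}
```

With a NULL in the list the two disagree on the row with value 2, which is
the gap a NOT IN aware join type (whatever it ends up being called) would
have to cover.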

On 2020/07/21 05:48:54, Danny Chan  wrote: 
> If it is only a constant NOT IN predicate, how difficult is it to rewrite it
> into a normal composite AND predicate before entering the planning phase?
> 
> Best,
> Danny Chan
> On Jul 21, 2020, 12:35 PM +0800, Haisheng Yuan wrote:
> > Thanks Jinpeng for providing a good example for not in subquery.
> >
> > I 100% agree with you that correlated query won't be represented by 
> > ANTI_NOTIN join type, and it is not the proposal's intention. Here what we 
> > are discussing is not to use ANTI_NOTIN to represent all the NOT IN 
> > sub-queries, that is impossible. Instead, if you take a close look at the 
> > example query, it is a simple uncorrelated NOT IN sub-query. That is the 
> > target. Let's focus on that kind of query, ask ourselves this question: Can 
> > such a simple query be transformed into a ANTI join to make the plan 
> > efficient?
> >
> > Sadly no. The reality is that this kind of query is not uncommon, may be 
> > much more common than correlated NOT IN sub-queries.
> >
> >
> > Reply to Julian:
> > > > How about making a sub-query type (in RexSubQuery), so it is gone
> > > > before we reach algebra.
> > It will be nice to have a NOT_IN subquery type, without expanding NOT IN to 
> > NOT(IN).
> > However, if there is no ANTI_NOTIN in the join type (without reaching 
> > algebra), does that mean the optimizer still can't generate efficient plan 
> > for simple NOT IN sub-queries?
> >
> > > > ANTI_NOTIN is a terrible name. ANTI means 'opposite', so ANTI_NOTIN is
> > > > the opposite of NOT IN?!
> > It depends how people interpret ANTI. You interpret it as "opposite", I 
> > interpret it as "ANTI JOIN", means "anti join for NOT IN, instead of NOT 
> > EXISTS". But it is just a naming issue, I am OK to change it whatever name 
> > that makes sense to the community, as long as it can convey the meaning.
> >
> > Thanks,
> > Haisheng
> >
> > On 2020/07/21 03:02:20, Jinpeng Wu  wrote:
> > > Hi.
> > >
> > > In some SQL engines, the query
> > > select * from A where c1 not in ( select c1 from B where B.c2 = A.c2);
> > > is transformed to a plan like
> > > select * from A LEFT ANTI JOIN B on A.c2 = B.c2 AND (A.c1 = B.c1 OR A.c1 is null OR B.c1 is null);
> > >
> > > Here, the "LEFT ANTI JOIN" is nothing more than the traditional definition. One
> > > thing that seems to be a problem is that A.c1 cannot be used as a join key in
> > > the new plan. However, that problem also exists for ANTI_NOTIN, and even for
> > > other NOT-IN-SUBQUERY physical implementations.
> > >
> > > Thanks,
> > > Qiupeng
> > >
> > > On Tue, Jul 21, 2020 at 5:30 AM Julian Hyde  wrote:
> > >
> > > > How about making a sub-query type (in RexSubQuery), so it is gone
> > > > before we reach algebra.
> > > >
> > > > ANTI_NOTIN is a terrible name. ANTI means 'opposite', so ANTI_NOTIN is
> > > > the opposite of NOT IN?!
> > > >
> > > > On Mon, Jul 20, 2020 at 1:00 PM Haisheng Yuan  wrote:
> > > > >
> > > > > Typo:
> > > > > We can just add a security guard saying that it is supported.
> > > > > Should be
> > > > > We can just add a security guard saying that it is NOT supported.
> > > > >
> > > > > On 2020/07/20 19:57:34, Haisheng Yuan  wrote:
> > > > > > > I am not sure I got your implication by "pollute". If you mean
> > > > > > > changes, yes, it requires some changes in rules. Do we need to change
> > > > > > > enumerables? Not necessary. We can just add a security guard saying that it
> > > > > > > is supported. Not everyone requires the Enumerable operators to support
> > > > > > > everything. More importantly, currently there is no logic or rules to
> > > > > > > translate sub-query directly to SEMI/ANTI joins, let alone translating
> > > > > > > directly to ANTI_NOTIN. Currently NOT IN is expanded to NOT(IN ...) before
> > > > > > > entering RelNode land. That means we don't even have the chance to generate
> > > > > > > the NOT IN anti join. Is that still a concern?
> > > > > >
> > > > > > Even if some day, some contributor extends Calcite's parser and
> > > > SubqueryRemovalRule to be able to transform NOT_IN subquery into NOT IN
> > > > anti join, we still have chance to disable it. Is that still a concern?
> > > > > >
> > > > > > There are many ways to play it safe.
> > > > > >
> > > > > > > Brainstorming: maybe we could consider it as a separate logical
> > > > operator
> > > > > > > (with its corresponding enumerable implementation)?
> > > > > > It doesn't sound cool. It requires much more work. You have to
> > > > duplicate all the rules, metadata handler that deal with LogicalJoin, 
> > > > and
> > > > for some rule that matches Join base class, you have to check it is a
> > > > LogicalJoin or the logical operator for ANTI_NOTIN.
> > > > > >
> > > > > > On 2020/07/20 08:28:42, Ruben Q L  wrote:
> > > > > > > I have some concerns that this new type would "pollute" the 
> > > > > > > existing
> > > > Join
> > > > > > > logic, rules and