[DISCUSS] Towards Calcite 1.22.0

2020-02-19 Thread Danny Chan
Hi all,

Approximately 5 months have passed since the previous release
(Calcite 1.21.0), and I was thinking that it is time to have the next
release by the end of February. To do this I think we should try to have an
RC around the 24th of February (next Monday).

The progress towards the next release can be seen in [1].
I have done some pre-editing of the issues; we do not have many issues
marked for fixing in version 1.22.0.

Because the release is long overdue, I would suggest we concentrate only on
solving these issues in the following days.
I'm not saying we are ignoring the existing PRs, but we need a release
now that should have happened 2 months ago.

If you really think an issue/PR should be in release 1.22.0, please
comment in this thread. I'm planning to cut RC0 on
the 24th of February, thanks ~

Don't hesitate to reply to this thread if the plan above is not convenient
for you!

[1]
https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12333950

Best,
Danny Chan


How to optimize repeated RelNode Structures? (CALCITE-3806)

2020-02-19 Thread Anjali Shrishrimal
Hi everybody,

I would like to have your suggestions on CALCITE-3806.

Asking it here as suggested by Julian.





If a RelNode tree contains a subtree whose result can be obtained from some
other part of the same tree, can we optimize it? And how can we express that
in the plan?

For example, let's say the input structure looks like this:

LogicalUnion(all=[true])
  LogicalProject(EMPNO=[$0])
    LogicalFilter(condition=[>=($0, 7369)])
      LogicalTableScan(table=[[scott, EMP]])
  LogicalProject(EMPNO=[$0])
    LogicalFilter(condition=[>=($0, 7369)])
      LogicalTableScan(table=[[scott, EMP]])

In this case,

  LogicalProject(EMPNO=[$0])
    LogicalFilter(condition=[>=($0, 7369)])
      LogicalTableScan(table=[[scott, EMP]])

is repeated, so the same data is fetched twice.

Can we save one fetch? Can we somehow tell the union's 2nd input to reuse the
union's 1st input? Is there any way to express that in the plan?



Also, if the structure was like this:

LogicalUnion(all=[true])
  LogicalProject(EMPNO=[$0])
    LogicalFilter(condition=[>=($0, 7369)])
      LogicalTableScan(table=[[scott, EMP]])
  LogicalProject(EMPNO=[$0])
    LogicalFilter(condition=[>=($0, 8000)])
      LogicalTableScan(table=[[scott, EMP]])

the second input of the union could perform its filtering on the data already
fetched by the first input, since the second input's output is a subset of
the first's.

Does Calcite provide this kind of optimization?

If not, what are the challenges in doing so?
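Not an authoritative answer, but one common way to detect this kind of repetition is to group subtrees by their digest string; in Calcite, every RelNode exposes one via RelNode#getDigest(). A minimal, Calcite-independent sketch of the grouping step (in real code the digestFn argument would be RelNode::getDigest; the class and method names here are invented):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class DigestGrouper {
  /** Groups nodes by a digest string; in Calcite the digest function
   * would be RelNode::getDigest. Any group with more than one entry is
   * a repeated subtree and a candidate for computing once and reusing. */
  static <T> Map<String, List<T>> groupByDigest(List<T> nodes, Function<T, String> digestFn) {
    Map<String, List<T>> groups = new LinkedHashMap<>();
    for (T node : nodes) {
      groups.computeIfAbsent(digestFn.apply(node), k -> new ArrayList<>()).add(node);
    }
    return groups;
  }
}
```

Any group with more than one entry could then be computed once and reused, e.g. via a spool-style operator, which is roughly what CALCITE-3806 asks about.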







Would love to hear your thoughts.




Thank you,
Anjali Shrishrimal




Calcite equivalent for Select CONVERT date formatter

2020-02-19 Thread Suresh Krishnan
Hi Team,

We are having trouble finding the right approach to converting date fields
using Calcite.

We are looking for something that will give us the equivalent of the SQL
below:
SELECT CONVERT(varchar(12), GETDATE(), 101)  -- 06/29/2009

We have tried creating a RexNode using the method below, but it didn't yield
the expected result.
RelDataType type = builder.getTypeFactory().createSqlType(SqlTypeName.DATE);
RexNode rexNode2 = builder.getRexBuilder().makeCall(type, 
SqlStdOperatorTable.CONVERT, rexNodes);
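One possible source of confusion (hedged, not an authoritative answer): in Calcite, SqlStdOperatorTable.CONVERT is the SQL-standard character-set conversion operator, not T-SQL's style-code CONVERT, so the call above would not produce date formatting. T-SQL style 101 simply formats a date as MM/DD/YYYY; a plain-Java sketch of that target behavior (class and method names invented):

```java
import java.text.SimpleDateFormat;
import java.util.Date;

class TsqlConvert101 {
  /** T-SQL CONVERT(varchar(12), GETDATE(), 101) formats a date as
   * MM/DD/YYYY; this method reproduces that target behavior in plain
   * Java, independent of Calcite. */
  static String formatStyle101(Date date) {
    return new SimpleDateFormat("MM/dd/yyyy").format(date);
  }
}
```

On the Calcite side, the closest standard building block may be a CAST to VARCHAR via RexBuilder#makeCast, though a plain CAST will not honor the MM/DD/YYYY style code.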


Could you please help with this request?

Thanks,
Suresh Krishnan


Calcite-Master - Build # 1615 - Failure

2020-02-19 Thread Apache Jenkins Server
The Apache Jenkins build system has built Calcite-Master (build #1615)

Status: Failure

Check console output at https://builds.apache.org/job/Calcite-Master/1615/ to 
view the results.

Re: limitations on the SQLs executed

2020-02-19 Thread Yang Liu
Hi all,

Diving into the docs, I found getJoinRowCount, which
seems to be the very thing I am looking for: I can reject the join when the
estimated rows are too many. Any ideas?
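For what it's worth, a guard based on that metadata could look roughly like the sketch below; the estimate would come from RelMetadataQuery#getRowCount on the join node (which is backed by handlers such as getJoinRowCount), and everything else, including the names, is an assumed application-side convention:

```java
class RowCountGuard {
  /** Pre-execution guard: 'estimate' would come from
   * RelMetadataQuery#getRowCount(joinRel); it is a Double and may be
   * null, in which case we conservatively reject. 'maxRows' is an
   * application-chosen threshold, assumed here. */
  static boolean acceptJoin(Double estimate, double maxRows) {
    return estimate != null && estimate <= maxRows;
  }
}
```

Note that row counts are only estimates, so in practice you would pick maxRows from tests of what your application can actually hold in memory.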

Yang Liu wrote on Mon, Feb 17, 2020 at 5:13 PM:

> Thanks Muhammad, can you give a more detailed description?
> Currently I have found RelOptCost; is this the one you are referring to?
> And I need to clarify: I do not mind the cost incurred in the data sources,
> since that will not affect my application; it may just take longer to
> get the result if the cost is high.
> I care about the cost inside our application and am afraid of OOM.
>
> Thanks
>
> Muhammad Gelbana wrote on Mon, Feb 17, 2020 at 4:26 AM:
>
>> If your only concern is memory utilization, I would try estimating
>> this using the plan's cost. But I guess you'll have to run some tests to
>> estimate the ranges you can accept.
>>
>>
>> On Sun, Feb 16, 2020 at 5:50 PM Yang Liu  wrote:
>>
>> > Is it possible to have some limitations on the SQLs to make sure our
>> > application, which depends on Calcite, is "safe"? For example, when
>> > doing a merge join between 2 large data sources, our application may
>> > OOM, since the joining process is in memory. If we had such a
>> > "limitation mechanism", we could refuse to execute the join and avoid
>> > the OOM.
>> >
>> > Or can we only do the check outside Calcite?
>> >
>> > Thanks
>> >
>>
>


Re: How to trace a column back to its original column

2020-02-19 Thread JiaTao Tao
Hi Igor
org.apache.calcite.rel.metadata.RelMetadataQuery#getColumnOrigins may help
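A rough sketch of turning that lookup's result into a readable string; the Calcite calls named in the comment are where the real data would come from, and the helper class itself is hypothetical:

```java
import java.util.List;

class OriginFormatter {
  /** Renders one origin as "schema.table.column". In Calcite,
   * 'qualifiedName' would come from RelColumnOrigin#getOriginTable()
   * .getQualifiedName(), and 'columnName' from the origin table's row
   * type at RelColumnOrigin#getOriginColumnOrdinal(). */
  static String formatOrigin(List<String> qualifiedName, String columnName) {
    return String.join(".", qualifiedName) + "." + columnName;
  }
}
```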

Regards!

Aron Tao


Seliverstov Igor wrote on Wed, Feb 19, 2020 at 11:16 PM:

> In this case each NID column is a calculated value, so the origin is erased.
>
> You may try to infer the origin by walking through the SQL AST recursively.
>
> I can’t come up with another way.
>
> Regards,
> Igor
>
> > On 19 Feb 2020, at 15:02, JiaTao Tao wrote:
> >
> > Thanks a lot
> >
> > But I found that if the select is from a subquery, `getFieldOrigins`
> returns null, because you cannot find the table (see the pic.).
> > String sql = "SELECT T1.NAME ,\n"
> > + "   T1.NID,\n"
> > + "   T2.NID,\n"
> > + "   T2.COMPANY\n"
> > + "FROM\n"
> > + "  (SELECT (U.ID  +100) AS NID,\n"
> > + "  U.NAME  AS NAME\n"
> > + "   FROM USERS U) T1\n"
> > + "JOIN\n"
> > + "  (SELECT (J.ID  +100) AS NID,\n"
> > + "  J.COMPANY\n"
> > + "   FROM JOBS J) T2 ON T1.NID = T2.NID";
> >
> >
> > Regards!
> > Aron Tao
> >
> >
> > Seliverstov Igor wrote on Wed, Feb 19, 2020 at 5:38 PM:
> > Aron Tao,
> >
> > I think you need the next two methods:
> >
> > org.apache.calcite.sql.validate.SqlValidatorImpl#getFieldOrigins
> > java.sql.ResultSetMetaData#getTableName
> >
> > Regards,
> > Igor
> >
> > > On 19 Feb 2020, at 10:29, JiaTao Tao wrote:
> > >
> > > Hi
> > > Thanks a lot.
> > > Can you give more details, say which method/field?
> > >
> > > Regards!
> > >
> > > Aron Tao
> > >
> > >
> > > Seliverstov Igor wrote on Wed, Feb 19, 2020 at 2:31 PM:
> > >
> > >> You can use their origins (which say where a column came from).
> > >>
> > >> It's accessible from SqlValidatorImpl or the JDBC result set.
> > >>
> > >> On Wed, 19 Feb 2020 at 9:25, JiaTao Tao wrote:
> > >>
> > >>> What I really need is to collect every part of the SQL: which
> > >>> columns are used as filters, and which are used as projections
> > >>> (columns on the source table). But
> > >>> the existence of nested subqueries complicates the issue.
> > >>>
> > >>>
> > >>> Regards!
> > >>>
> > >>> Aron Tao
> > >>>
> > >>>
> > >>> JiaTao Tao wrote on Wed, Feb 19, 2020 at 2:17 PM:
> > >>>
> >  SQL like this:
> >  ```
> >  SELECT T1.NID,
> >    T1.NAME ,
> >    T2.COMPANY
> >  FROM
> >   (SELECT (U.ID  +100) AS NID,
> >   U.NAME  AS NAME
> >    FROM USERS U) T1
> >  JOIN
> >   (SELECT (J.ID  +100) AS NID,
> >   J.COMPANY
> >    FROM JOBS J) T2 ON T1.NID = T2.NID
> >  ```
> >  What I wanted is that T1.NID comes from USERS.ID, and
> >  T1.NAME comes from USERS.NAME.
> > 
> >  Has anyone done similar work? Is there a ready-made example to refer
> > >> to?
> > 
> > 
> >  Regards!
> > 
> >  Aron Tao
> > 
> > >>>
> > >>
> >
>
>


[jira] [Created] (CALCITE-3807) checkForSatisfiedConverters() is unnecessary

2020-02-19 Thread Xiening Dai (Jira)
Xiening Dai created CALCITE-3807:


 Summary: checkForSatisfiedConverters() is unnecessary 
 Key: CALCITE-3807
 URL: https://issues.apache.org/jira/browse/CALCITE-3807
 Project: Calcite
  Issue Type: Bug
Reporter: Xiening Dai


When VolcanoPlanner registers an abstract converter, it adds the converter to
the set.abstractConverters list, then calls checkSatisfiedConverter() to see
whether any converter is satisfied and can be removed from the list. But every
abstract converter always satisfies itself (changeTraitsUsingConverters()
returns itself), so the converter is removed from the list right after it is
added. The check is therefore completely unnecessary, and it slows down the
planner.
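The described behavior can be illustrated with a small, Calcite-independent simulation (names invented): if the "satisfies" relation is reflexive, each converter is removed immediately after being added, so the check never retains anything.

```java
import java.util.List;
import java.util.function.BiPredicate;

class ConverterListSim {
  /** Simulates the register step: add the new converter, then remove
   * any converter that is now satisfied. With a reflexive 'satisfies'
   * relation (every converter satisfies itself, as described above),
   * the new entry is removed immediately and the list never grows. */
  static <T> void registerAndCheck(List<T> converters, T newConverter, BiPredicate<T, T> satisfies) {
    converters.add(newConverter);
    converters.removeIf(c -> satisfies.test(c, newConverter));
  }
}
```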



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Translation of SQL EXISTS

2020-02-19 Thread Julian Hyde
Agreed, it should be in reverse order. Translate to semi-join (or
anti-join for NOT EXISTS), then optionally use a rule to rewrite semi-
or anti-join to Join+Aggregate.

Note that if the EXISTS is in a disjunction (e.g. "delete from orders
where exists (select null from order_items where ...) or shipping >
20") we cannot use a semi-join. We have to use a left join, using an
indicator column on the right-hand side that will be null iff there is
no match. Which is what we do currently.

On Wed, Feb 19, 2020 at 10:03 AM Haisheng Yuan  wrote:
>
> Hi Christian,
>
> For the query in your example, Calcite first generates an inner-join plan
> with an aggregate child, then, through SemiJoinRule, transforms the inner
> join into a semi or anti-semi join. The reason for having the inner join is
> that it allows join commutativity, which is good for generating a
> potentially better plan with a nested-loop join or hash join.
>
> Admittedly, this process in Calcite is counterintuitive. It should be in
> reverse order: first generate a semi or anti-semi join, then generate an
> inner/outer join.
>
> - Haisheng
>
> --
> 发件人:Christian Beikov
> 日 期:2020年02月19日 21:12:13
> 收件人:
> 主 题:Translation of SQL EXISTS
>
> Hello,
>
> I'm a bit confused about how the SQL EXISTS predicate is translated. I'd
> assume that an EXISTS is translated in relational algebra to a SEMI- and
> NOT EXISTS to an ANTI-join, but it's not.
>
> PlannerImpl p = new PlannerImpl(config);
> SqlNode sqlNode = p.parse("delete from _order o where exists (select 1
> from order_position p where o.id = p.order_id)");
> p.validate(sqlNode);
> RelRoot rel = p.rel(sqlNode);
> RelToSqlConverter sqlConverter = new RelToSqlConverter(dialect);
> SqlImplementor.Result result = sqlConverter.visitChild(0, rel.rel);
> sqlWriter.format(result.asStatement());
>
> Worse, when printing this, I only get DELETE FROM "public"."_order" i.e.
> the EXISTS part is not rendered. This is the plan I get.
>
> LogicalTableModify(table=[[adhoc, _order]], operation=[DELETE],
> flattened=[true])
>LogicalProject(inputs=[0])
>  LogicalProject(inputs=[0], exprs=[[CAST($1):BIGINT, CAST($2):BOOLEAN]])
>LogicalJoin(condition=[=($0, $1)], joinType=[inner])
>  JdbcTableScan(table=[[adhoc, _order]])
>  LogicalAggregate(group=[{0}], agg#0=[MIN($1)])
>LogicalProject(exprs=[[$1, true]])
>  JdbcTableScan(table=[[adhoc, order_position]])
>
> I'd expect something along the lines of
>
> LogicalTableModify(table=[[adhoc, _order]], operation=[DELETE],
> flattened=[true])
>LogicalProject(inputs=[0])
>LogicalJoin(condition=[=($0, $1)], joinType=[semi])
>  JdbcTableScan(table=[[adhoc, _order]])
>  JdbcTableScan(table=[[adhoc, order_position]])
>
> and for NOT EXISTS
>
> LogicalTableModify(table=[[adhoc, _order]], operation=[DELETE],
> flattened=[true])
>LogicalProject(inputs=[0])
>LogicalJoin(condition=[=($0, $1)], joinType=[anti])
>  JdbcTableScan(table=[[adhoc, _order]])
>  JdbcTableScan(table=[[adhoc, order_position]])
>
> Am I missing something and the current aggregate function translation
> makes sense?
>
> I constructed relational algebra structures for some other statements
> with SEMI- and ANTI-joins and already noticed that these join types
> weren't handled in
> org.apache.calcite.rel.rel2sql.RelToSqlConverter#visit(org.apache.calcite.rel.core.Join),
> which I fixed locally. Is the lack of a translation intentional?
>
> Is such a translation of SEMI- and ANTI-join to EXISTS and NOT EXISTS an
> over-simplification or would you say it's correct? As far as I
> understood from https://en.wikipedia.org/wiki/Relational_algebra this is
> correct.
>
> I'd be happy to contribute that back. I didn't look into the Sql-to-Rel
> translation for EXISTS and NOT EXISTS to SEMI- and ANTI-join yet, but I
> assume that's not that hard and I could add that.
>
> Regards,
>
> Christian
>
>


Re: Re: Translation of SQL EXISTS

2020-02-19 Thread Haisheng Yuan
Hi Christian,

Yes, I do agree it is better to generate SEMI/ANTI during Sql-to-Rel conversion 
if possible.

With regard to the missing Rel-to-Sql conversion for SEMI/ANTI joins, you can
log a JIRA and open a pull request for it; people will be happy to take a look.

- Haisheng

--
发件人:Christian Beikov
日 期:2020年02月20日 03:11:02
收件人:Haisheng Yuan; Apache Calcite dev 
list
主 题:Re: Translation of SQL EXISTS


Hey Haisheng,
it is nice to have a rule that detects such patterns but do you agree that it 
would be better to generate SEMI- and ANTI-joins already during Sql-to-Rel 
conversion?
I implemented the Rel-to-Sql conversion for SEMI- and ANTI-join and it seems to 
work for my use cases, but it feels a bit ugly as I have to do some 
transformations which might be ad-hoc. I had to replace identifiers that refer 
to the outcome of the SEMI- and ANTI-joins with the select items of the LHS 
because the SEMI- and ANTI-join does not get an alias.
See for yourself: 
https://github.com/beikov/calcite/commit/6ca67e2e8df44c2081e71407ee6893a961f36271
I'll add a few tests and a comment, but could anyone please comment on whether 
this approach looks ok?
Regards,
Christian
On 19.02.2020 at 19:03, Haisheng Yuan wrote:
 
Hi Christian,

For the query in your example, Calcite first generates an inner-join plan with
an aggregate child, then, through SemiJoinRule, transforms the inner join into
a semi or anti-semi join. The reason for having the inner join is that it
allows join commutativity, which is good for generating a potentially better
plan with a nested-loop join or hash join.

Admittedly, this process in Calcite is counterintuitive. It should be in
reverse order: first generate a semi or anti-semi join, then generate an
inner/outer join.

- Haisheng

--
 发件人:Christian Beikov
 日 期:2020年02月19日 21:12:13
 收件人:
 主 题:Translation of SQL EXISTS

 Hello,

 I'm a bit confused about how the SQL EXISTS predicate is translated. I'd 
 assume that an EXISTS is translated in relational algebra to a SEMI- and 
 NOT EXISTS to an ANTI-join, but it's not.

 PlannerImpl p = new PlannerImpl(config);
 SqlNode sqlNode = p.parse("delete from _order o where exists (select 1 
 from order_position p where o.id = p.order_id)");
 p.validate(sqlNode);
 RelRoot rel = p.rel(sqlNode);
 RelToSqlConverter sqlConverter = new RelToSqlConverter(dialect);
 SqlImplementor.Result result = sqlConverter.visitChild(0, rel.rel);
 sqlWriter.format(result.asStatement());

 Worse, when printing this, I only get DELETE FROM "public"."_order" i.e. 
 the EXISTS part is not rendered. This is the plan I get.

 LogicalTableModify(table=[[adhoc, _order]], operation=[DELETE], 
 flattened=[true])
LogicalProject(inputs=[0])
  LogicalProject(inputs=[0], exprs=[[CAST($1):BIGINT, CAST($2):BOOLEAN]])
LogicalJoin(condition=[=($0, $1)], joinType=[inner])
  JdbcTableScan(table=[[adhoc, _order]])
  LogicalAggregate(group=[{0}], agg#0=[MIN($1)])
LogicalProject(exprs=[[$1, true]])
  JdbcTableScan(table=[[adhoc, order_position]])

 I'd expect something along the lines of

 LogicalTableModify(table=[[adhoc, _order]], operation=[DELETE], 
 flattened=[true])
LogicalProject(inputs=[0])
LogicalJoin(condition=[=($0, $1)], joinType=[semi])
  JdbcTableScan(table=[[adhoc, _order]])
  JdbcTableScan(table=[[adhoc, order_position]])

 and for NOT EXISTS

 LogicalTableModify(table=[[adhoc, _order]], operation=[DELETE], 
 flattened=[true])
LogicalProject(inputs=[0])
LogicalJoin(condition=[=($0, $1)], joinType=[anti])
  JdbcTableScan(table=[[adhoc, _order]])
  JdbcTableScan(table=[[adhoc, order_position]])

 Am I missing something and the current aggregate function translation 
 makes sense?

 I constructed relational algebra structures for some other statements 
 with SEMI- and ANTI-joins and already noticed that these join types 
 weren't handled in 
 
org.apache.calcite.rel.rel2sql.RelToSqlConverter#visit(org.apache.calcite.rel.core.Join),
 
 which I fixed locally. Is the lack of a translation intentional?

 Is such a translation of SEMI- and ANTI-join to EXISTS and NOT EXISTS an 
 over-simplification or would you say it's correct? As far as I 
 understood from https://en.wikipedia.org/wiki/Relational_algebra this is 
 correct.

 I'd be happy to contribute that back. I didn't look into the Sql-to-Rel 
 translation for EXISTS and NOT EXISTS to SEMI- and ANTI-join yet, but I 
 assume that's not that hard and I could add that.

 Regards,

 Christian





Re: Translation of SQL EXISTS

2020-02-19 Thread Christian Beikov

Hey Haisheng,

it is nice to have a rule that detects such patterns but do you agree 
that it would be better to generate SEMI- and ANTI-joins already during 
Sql-to-Rel conversion?


I implemented the Rel-to-Sql conversion for SEMI- and ANTI-join and it 
seems to work for my use cases, but it feels a bit ugly as I have to do 
some transformations which might be ad-hoc. I had to replace identifiers 
that refer to the outcome of the SEMI- and ANTI-joins with the select 
items of the LHS because the SEMI- and ANTI-join does not get an alias.


See for yourself: 
https://github.com/beikov/calcite/commit/6ca67e2e8df44c2081e71407ee6893a961f36271


I'll add a few tests and a comment, but could anyone please comment on 
whether this approach looks ok?


Regards,

Christian

On 19.02.2020 at 19:03, Haisheng Yuan wrote:

Hi Christian,

For the query in your example, Calcite first generates an inner-join plan
with an aggregate child, then, through SemiJoinRule, transforms the inner
join into a semi or anti-semi join. The reason for having the inner join is
that it allows join commutativity, which is good for generating a potentially
better plan with a nested-loop join or hash join.

Admittedly, this process in Calcite is counterintuitive. It should be in
reverse order: first generate a semi or anti-semi join, then generate an
inner/outer join.


- Haisheng

--
发件人:Christian Beikov
日 期:2020年02月19日 21:12:13
收件人:
主 题:Translation of SQL EXISTS

Hello,

I'm a bit confused about how the SQL EXISTS predicate is translated. I'd
assume that an EXISTS is translated in relational algebra to a SEMI- and
NOT EXISTS to an ANTI-join, but it's not.

PlannerImpl p = new PlannerImpl(config);
SqlNode sqlNode = p.parse("delete from _order o where exists (select 1
from order_position p where o.id = p.order_id)");
p.validate(sqlNode);
RelRoot rel = p.rel(sqlNode);
RelToSqlConverter sqlConverter = new RelToSqlConverter(dialect);
SqlImplementor.Result result = sqlConverter.visitChild(0, rel.rel);
sqlWriter.format(result.asStatement());

Worse, when printing this, I only get DELETE FROM "public"."_order", i.e.
the EXISTS part is not rendered. This is the plan I get.

LogicalTableModify(table=[[adhoc, _order]], operation=[DELETE],
flattened=[true])
   LogicalProject(inputs=[0])
 LogicalProject(inputs=[0], exprs=[[CAST($1):BIGINT, CAST($2):BOOLEAN]])
   LogicalJoin(condition=[=($0, $1)], joinType=[inner])
 JdbcTableScan(table=[[adhoc, _order]])
 LogicalAggregate(group=[{0}], agg#0=[MIN($1)])
   LogicalProject(exprs=[[$1, true]])
 JdbcTableScan(table=[[adhoc, order_position]])

I'd expect something along the lines of

LogicalTableModify(table=[[adhoc, _order]], operation=[DELETE],
flattened=[true])
   LogicalProject(inputs=[0])
   LogicalJoin(condition=[=($0, $1)], joinType=[semi])
 JdbcTableScan(table=[[adhoc, _order]])
 JdbcTableScan(table=[[adhoc, order_position]])

and for NOT EXISTS

LogicalTableModify(table=[[adhoc, _order]], operation=[DELETE],
flattened=[true])
   LogicalProject(inputs=[0])
   LogicalJoin(condition=[=($0, $1)], joinType=[anti])
 JdbcTableScan(table=[[adhoc, _order]])
 JdbcTableScan(table=[[adhoc, order_position]])

Am I missing something and the current aggregate function translation
makes sense?

I constructed relational algebra structures for some other statements
with SEMI- and ANTI-joins and already noticed that these join types
weren't handled in
org.apache.calcite.rel.rel2sql.RelToSqlConverter#visit(org.apache.calcite.rel.core.Join),
which I fixed locally. Is the lack of a translation intentional?

Is such a translation of SEMI- and ANTI-join to EXISTS and NOT EXISTS an
over-simplification, or would you say it's correct? As far as I understood
from https://en.wikipedia.org/wiki/Relational_algebra this is correct.

I'd be happy to contribute that back. I didn't look into the Sql-to-Rel
translation for EXISTS and NOT EXISTS to SEMI- and ANTI-join yet, but I
assume that's not that hard and I could add that.

Regards,

Christian




Re: How to trace a column back to its original column

2020-02-19 Thread Rui Wang
If you have to traverse the whole AST/Rel tree to find original columns,
this thread discussed that before:
https://lists.apache.org/thread.html/11b66dd7f389f0b6e6bce54ad2d7b49a8a0bdd3be4784c441bfafb81%40%3Cdev.calcite.apache.org%3E



-Rui

On Wed, Feb 19, 2020 at 7:16 AM Seliverstov Igor 
wrote:

> In this case each NID column is a calculated value, so the origin is erased.
>
> You may try to infer the origin by walking through the SQL AST recursively.
>
> I can’t come up with another way.
>
> Regards,
> Igor
>
> > On 19 Feb 2020, at 15:02, JiaTao Tao wrote:
> >
> > Thanks a lot
> >
> > But I found that if the select is from a subquery, `getFieldOrigins`
> returns null, because you cannot find the table (see the pic.).
> > String sql = "SELECT T1.NAME ,\n"
> > + "   T1.NID,\n"
> > + "   T2.NID,\n"
> > + "   T2.COMPANY\n"
> > + "FROM\n"
> > + "  (SELECT (U.ID  +100) AS NID,\n"
> > + "  U.NAME  AS NAME\n"
> > + "   FROM USERS U) T1\n"
> > + "JOIN\n"
> > + "  (SELECT (J.ID  +100) AS NID,\n"
> > + "  J.COMPANY\n"
> > + "   FROM JOBS J) T2 ON T1.NID = T2.NID";
> >
> >
> > Regards!
> > Aron Tao
> >
> >
> > Seliverstov Igor wrote on Wed, Feb 19, 2020 at 5:38 PM:
> > Aron Tao,
> >
> > I think you need the next two methods:
> >
> > org.apache.calcite.sql.validate.SqlValidatorImpl#getFieldOrigins
> > java.sql.ResultSetMetaData#getTableName
> >
> > Regards,
> > Igor
> >
> > > On 19 Feb 2020, at 10:29, JiaTao Tao wrote:
> > >
> > > Hi
> > > Thanks a lot.
> > > Can you give more details, say which method/field?
> > >
> > > Regards!
> > >
> > > Aron Tao
> > >
> > >
> > > Seliverstov Igor wrote on Wed, Feb 19, 2020 at 2:31 PM:
> > >
> > >> You can use their origins (which say where a column came from).
> > >>
> > >> It's accessible from SqlValidatorImpl or the JDBC result set.
> > >>
> > >> On Wed, 19 Feb 2020 at 9:25, JiaTao Tao wrote:
> > >>
> > >>> What I really need is to collect every part of the SQL: which
> > >>> columns are used as filters, and which are used as projections
> > >>> (columns on the source table). But
> > >>> the existence of nested subqueries complicates the issue.
> > >>>
> > >>>
> > >>> Regards!
> > >>>
> > >>> Aron Tao
> > >>>
> > >>>
> > >>> JiaTao Tao wrote on Wed, Feb 19, 2020 at 2:17 PM:
> > >>>
> >  SQL like this:
> >  ```
> >  SELECT T1.NID,
> >    T1.NAME ,
> >    T2.COMPANY
> >  FROM
> >   (SELECT (U.ID  +100) AS NID,
> >   U.NAME  AS NAME
> >    FROM USERS U) T1
> >  JOIN
> >   (SELECT (J.ID  +100) AS NID,
> >   J.COMPANY
> >    FROM JOBS J) T2 ON T1.NID = T2.NID
> >  ```
> >  What I wanted is that T1.NID comes from USERS.ID, and
> >  T1.NAME comes from USERS.NAME.
> > 
> >  Has anyone done similar work? Is there a ready-made example to refer
> > >> to?
> > 
> > 
> >  Regards!
> > 
> >  Aron Tao
> > 
> > >>>
> > >>
> >
>
>


Re: Translation of SQL EXISTS

2020-02-19 Thread Haisheng Yuan
Hi Christian,

For the query in your example, Calcite first generates an inner-join plan
with an aggregate child, then, through SemiJoinRule, transforms the inner
join into a semi or anti-semi join. The reason for having the inner join is
that it allows join commutativity, which is good for generating a potentially
better plan with a nested-loop join or hash join.

Admittedly, this process in Calcite is counterintuitive. It should be in
reverse order: first generate a semi or anti-semi join, then generate an
inner/outer join.

- Haisheng

--
发件人:Christian Beikov
日 期:2020年02月19日 21:12:13
收件人:
主 题:Translation of SQL EXISTS

Hello,

I'm a bit confused about how the SQL EXISTS predicate is translated. I'd 
assume that an EXISTS is translated in relational algebra to a SEMI- and 
NOT EXISTS to an ANTI-join, but it's not.

PlannerImpl p = new PlannerImpl(config);
SqlNode sqlNode = p.parse("delete from _order o where exists (select 1 
from order_position p where o.id = p.order_id)");
p.validate(sqlNode);
RelRoot rel = p.rel(sqlNode);
RelToSqlConverter sqlConverter = new RelToSqlConverter(dialect);
SqlImplementor.Result result = sqlConverter.visitChild(0, rel.rel);
sqlWriter.format(result.asStatement());

Worse, when printing this, I only get DELETE FROM "public"."_order" i.e. 
the EXISTS part is not rendered. This is the plan I get.

LogicalTableModify(table=[[adhoc, _order]], operation=[DELETE], 
flattened=[true])
   LogicalProject(inputs=[0])
 LogicalProject(inputs=[0], exprs=[[CAST($1):BIGINT, CAST($2):BOOLEAN]])
   LogicalJoin(condition=[=($0, $1)], joinType=[inner])
 JdbcTableScan(table=[[adhoc, _order]])
 LogicalAggregate(group=[{0}], agg#0=[MIN($1)])
   LogicalProject(exprs=[[$1, true]])
 JdbcTableScan(table=[[adhoc, order_position]])

I'd expect something along the lines of

LogicalTableModify(table=[[adhoc, _order]], operation=[DELETE], 
flattened=[true])
   LogicalProject(inputs=[0])
   LogicalJoin(condition=[=($0, $1)], joinType=[semi])
 JdbcTableScan(table=[[adhoc, _order]])
 JdbcTableScan(table=[[adhoc, order_position]])

and for NOT EXISTS

LogicalTableModify(table=[[adhoc, _order]], operation=[DELETE], 
flattened=[true])
   LogicalProject(inputs=[0])
   LogicalJoin(condition=[=($0, $1)], joinType=[anti])
 JdbcTableScan(table=[[adhoc, _order]])
 JdbcTableScan(table=[[adhoc, order_position]])

Am I missing something and the current aggregate function translation 
makes sense?

I constructed relational algebra structures for some other statements 
with SEMI- and ANTI-joins and already noticed that these join types 
weren't handled in 
org.apache.calcite.rel.rel2sql.RelToSqlConverter#visit(org.apache.calcite.rel.core.Join),
 
which I fixed locally. Is the lack of a translation intentional?

Is such a translation of SEMI- and ANTI-join to EXISTS and NOT EXISTS an 
over-simplification or would you say it's correct? As far as I 
understood from https://en.wikipedia.org/wiki/Relational_algebra this is 
correct.

I'd be happy to contribute that back. I didn't look into the Sql-to-Rel 
translation for EXISTS and NOT EXISTS to SEMI- and ANTI-join yet, but I 
assume that's not that hard and I could add that.

Regards,

Christian




Re: Correlation with a variable

2020-02-19 Thread Julian Hyde
When you say "keep the plan as independent as possible from statement
and other parts of the execution", if you propose to achieve that by
adding names into the RexNodes representing parameters, I don't agree.
I think that RelNode/RexNode should operate based on internal offsets
(retaining names, if possible, as a debugging aid) but not worry about
the names used by the client.

That said, we have RelRoot, which acts as a wrapper and mapping layer.
It adds metadata to the root RelNode, but RelNodes and RexNodes never
reference it.

Note that old.field and new.field is a special case of named
parameters, and 'old' and 'new' are reserved words in the standard for
trigger SQL. I don't think that named parameters in general should
allow '.'.

Julian
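Julian's mapping-table idea, index-based RexDynamicParam plus a statement-level name-to-index table, can be sketched in a few lines; the class and method names below are invented for illustration, and a real implementation would live alongside the statement rather than in RexNode land:

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ParamMapper {
  /** Statement-level mapping table: RexDynamicParam stays index-based,
   * and names such as "old.somecolumn" are resolved to indexes outside
   * the RelNode tree. */
  private final Map<String, Integer> nameToIndex = new LinkedHashMap<>();

  /** Returns the dynamic-parameter index for a name, assigning the
   * next free index on first use. */
  int indexOf(String name) {
    Integer existing = nameToIndex.get(name);
    if (existing != null) {
      return existing;
    }
    int index = nameToIndex.size();
    nameToIndex.put(name, index);
    return index;
  }
}
```

With this, "old.somecolumn" could be bound into RexDynamicParam #0 and "new.somecolumn" into #1, while the plan itself only ever sees the indexes.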

On Wed, Feb 19, 2020 at 12:29 AM Stamatis Zampetakis  wrote:
>
> I don't know if there is already an issue around this topic, I think not.
>
> Now regarding point (b) I would prefer to keep the plan as independent as
> possible from statement and other parts of the execution.
>
> Other than that I encountered a few use-cases where named parameters may be
> a better fit:
> * Plans with parameters that are created from different components of an
> application and need to be combined together. It is more difficult to
> ensure that param collisions do not occur when we are using index based
> parameters. Having named parameters where each component uses a reserved
> prefix makes things slightly easier to handle.
> * Using the same parameter (value) in multiple places (which could help in
> simplifications) is semantically easier to express with names.
> * Printing and debugging a plan.
>
> Certainly there are also benefits in keeping a single RexNode class for
> parameters so I am fine with whatever decision we take in the end.
>
> Best,
> Stamatis
>
> On Tue, Feb 18, 2020, 9:18 PM Julian Hyde  wrote:
>
> > It’s worth separating this into (a) what the parser should do (i.e. how we
> > represent this in SqlNode land) and (b) how we represent this in
> > RexNode/RelNode land.
> >
> > I think the parser should generate some flavor of SqlDynamicParam.
> > Currently these only have ‘int index’ but we’d need to hold old/new as a
> > boolean and “somecolumn” as a string.
> >
> > In RexNode land I think these should remain RexDynamicParam with an int
> > index. There would be some mapping table in the statement saying that
> > old.somecolumn should be bound into RexDynamicParam #0, etc. I believe this
> > is what Oracle’s PL/SQL pre-processor does.
> >
> > SqlToRelConverter would need to be changed in order to map between these
> > representations.
> >
> > Julian
> >
> >
> > > On Feb 18, 2020, at 11:15 AM, Christian Beikov <
> > christian.bei...@gmail.com> wrote:
> > >
> > > Hey Stamatis,
> > >
> > > thanks for your help.
> > >
> > > Am 18.02.2020 um 19:34 schrieb Stamatis Zampetakis:
> > >> Hi Christian,
> > >>
> > >> Long story short: maybe you can achieve what you want by adding a new
> > type
> > >> of RexNode. From what I remember quoting is applied to identifiers
> > (schema,
> > >> table, column names) so maybe if OLD/NEW are parameters (e.g.,
> > >> RexNamedParam) then possibly it does not make sense to quote them.
> > Right, and actually it is wrong to quote these variables/parameters, i.e.
> it's an error, as PG complains that there is no relation with that name.
> > >> At row level triggers OLD and NEW refer to a single tuple/row at each
> > point
> > >> in time. In this case, OLD and NEW are row type variables, or better say
> > >> parameters, with the same type as the table. In Calcite, there are
> > >> index-based parameters (RexDynamicParam) but not named parameters as the
> > >> one you seem to need. I think named parameters are useful in various
> > >> scenarios so possibly we could expand the RexNode hierarchy.
> > >> In this case to build the plan probably it suffices to create and plug
> > the
> > >> parameter wherever you need.
> > > That sounds great and just about what I would need. Is there an issue
> > for this already?
> > >> At statement level triggers OLD and NEW refer to set of tuples/rows at
> > each
> > >> point in time. In this case, OLD and NEW appear as (temporary)
> > >> relations/tables with the same type as the table. In terms of
> > >> implementation, I assume that the user defined query acts as a subquery
> > >> correlated with OLD/NEW as necessary.
> > > Correct, but right now I am using row level triggers. I'll try to
> introduce some kind of synthetic view that holds the state so I can use a
> > TransientScan for now.
> > >> In this case to build the plan probably you need to introduce scan
> > >> operations over OLD/NEW tables and create a correlation with the rest of
> > >> the query.
> > > As of PG10 the REFERENCING clause can be used to introduce a temporary
> > view for an old and new relation.
> > >> Best,
> > >> Stamatis
> > >>
> > >> On Tue, Feb 18, 2020 at 7:03 PM Christian Beikov <
> > christian.bei...@gmail.com>
> > >> 

Re: How to trace a column back to its original column

2020-02-19 Thread Seliverstov Igor
In this case each NID column is a calculated value, so the origin is erased.

You may try to infer the origin by walking through the SQL AST recursively.

I can’t come up with another way.

Regards,
Igor

> On Feb 19, 2020, at 15:02, JiaTao Tao wrote:
> 
> Thanks a lot
> 
> But I found that if the select is from a subquery, `getFieldOrigins` returns
> null, because you cannot find the table (see the pic.).
> String sql = "SELECT T1.NAME,\n"
> + "   T1.NID,\n"
> + "   T2.NID,\n"
> + "   T2.COMPANY\n"
> + "FROM\n"
> + "  (SELECT (U.ID +100) AS NID,\n"
> + "  U.NAME AS NAME\n"
> + "   FROM USERS U) T1\n"
> + "JOIN\n"
> + "  (SELECT (J.ID +100) AS NID,\n"
> + "  J.COMPANY\n"
> + "   FROM JOBS J) T2 ON T1.NID = T2.NID";
> 
> 
> Regards!
> Aron Tao
> 
> 
> On Wed, Feb 19, 2020 at 5:38 PM, Seliverstov Igor <gvvinbl...@gmail.com> wrote:
> Aron Tao,
> 
> I think you need the following two methods:
> 
> org.apache.calcite.sql.validate.SqlValidatorImpl#getFieldOrigins
> java.sql.ResultSetMetaData#getTableName
> 
> Regards,
> Igor
> 
> > On Feb 19, 2020, at 10:29, JiaTao Tao wrote:
> > 
> > Hi
> > Thanks a lot.
> > Can you give more details, say which method/field?
> > 
> > Regards!
> > 
> > Aron Tao
> > 
> > 
> > On Wed, Feb 19, 2020 at 2:31 PM, Seliverstov Igor <gvvinbl...@gmail.com> wrote:
> > 
> >> You can use their origins (says where a column came from).
> >> 
> >> It's accessible from SqlValidatorImpl or the JDBC result set.
> >> 
> >> On Wed, Feb 19, 2020, 9:25 AM, JiaTao Tao wrote:
> >> 
> >>> What I really need is to collect every part of the SQL: which columns are
> >>> used as filters, and which are used as projections (columns on the source
> >>> table). But
> >>> the existence of nested subqueries complicates the issue.
> >>> 
> >>> 
> >>> Regards!
> >>> 
> >>> Aron Tao
> >>> 
> >>> 
> >>> On Wed, Feb 19, 2020 at 2:17 PM, JiaTao Tao <t...@apache.org> wrote:
> >>> 
>  SQL like this:
>  ```
>  SELECT T1.NID,
>    T1.NAME,
>    T2.COMPANY
>  FROM
>   (SELECT (U.ID +100) AS NID,
>   U.NAME AS NAME
>    FROM USERS U) T1
>  JOIN
>   (SELECT (J.ID +100) AS NID,
>   J.COMPANY
>    FROM JOBS J) T2 ON T1.NID = T2.NID
>  ```
>  What I wanted is that T1.NID comes from USER.ID, and
>  T1.NAME comes from
>  USER.NAME.
>  
>  Has anyone done similar work? Is there a ready-made example to refer
> >> to?
>  
>  
>  Regards!
>  
>  Aron Tao
>  
> >>> 
> >> 
> 



Build failed in Jenkins: Calcite-Snapshots #115

2020-02-19 Thread Apache Jenkins Server
See 


Changes:

[yuzhao.cyz] Following [CALCITE-3763] Fix slow test failure (DonnyZone)

[yuzhao.cyz] [CALCITE-3801] Deprecate 
SqlToRelConverter.Config#isConvertTableAccess

[yuzhao.cyz] [CALCITE-2707] Information about distinct aggregation is lost in

[chunwei] [CALCITE-3803] Enhance RexSimplify to simplify 'a>1 or (a<3 and b)' to


--
[...truncated 106.67 KB...]

> Configure project :example:csv
Evaluating project ':example:csv' using build file 
'

> Configure project :example:function
Evaluating project ':example:function' using build file 
'
All projects evaluated.
Selected primary task 'publish' from project :
Tasks to be executed: [task ':publish', task ':linq4j:compileJava', task 
':core:compileKotlin', task ':core:fmppMain', task ':core:javaCCMain', task 
':core:versionClass', task ':core:compileJava', task ':babel:compileKotlin', 
task ':babel:fmppMain', task ':babel:javaCCMain', task ':babel:compileJava', 
task ':babel:processResources', task ':babel:classes', task 
':babel:inspectClassesForKotlinIC', task ':babel:jar', task 
':babel:generateMetadataFileForBabelPublication', task 
':babel:generatePomFileForBabelPublication', task ':gitProps', task ':rat', 
task ':validateBeforeBuildingReleaseArtifacts', task 
':validateNexusCredentials', task ':babel:initializeNexusStagingRepository', 
task ':babel:javadoc', task ':babel:javadocJar', task ':babel:sourcesJar', task 
':babel:publishBabelPublicationToNexusRepository', task ':babel:publish', task 
':cassandra:compileJava', task ':cassandra:processResources', task 
':cassandra:classes', task ':cassandra:jar', task 
':cassandra:generateMetadataFileForCassandraPublication', task 
':cassandra:generatePomFileForCassandraPublication', task 
':cassandra:initializeNexusStagingRepository', task ':cassandra:javadoc', task 
':cassandra:javadocJar', task ':cassandra:sourcesJar', task 
':cassandra:publishCassandraPublicationToNexusRepository', task 
':cassandra:publish', task ':core:processResources', task ':core:classes', task 
':core:inspectClassesForKotlinIC', task ':core:jar', task 
':core:generateMetadataFileForCorePublication', task 
':core:generatePomFileForCorePublication', task 
':core:initializeNexusStagingRepository', task ':core:javadoc', task 
':core:javadocJar', task ':core:sourcesJar', task ':core:compileTestKotlin', 
task ':core:fmppTest', task ':core:javaCCTest', task ':core:compileTestJava', 
task ':core:processTestResources', task ':core:testClasses', task 
':core:testJar', task ':core:publishCorePublicationToNexusRepository', task 
':core:publish', task ':druid:compileJava', task ':druid:processResources', 
task ':druid:classes', task ':druid:jar', task 
':druid:generateMetadataFileForDruidPublication', task 
':druid:generatePomFileForDruidPublication', task 
':druid:initializeNexusStagingRepository', task ':druid:javadoc', task 
':druid:javadocJar', task ':druid:sourcesJar', task 
':druid:publishDruidPublicationToNexusRepository', task ':druid:publish', task 
':elasticsearch:compileJava', task ':elasticsearch:processResources', task 
':elasticsearch:classes', task ':elasticsearch:jar', task 
':elasticsearch:generateMetadataFileForElasticsearchPublication', task 
':elasticsearch:generatePomFileForElasticsearchPublication', task 
':elasticsearch:initializeNexusStagingRepository', task 
':elasticsearch:javadoc', task ':elasticsearch:javadocJar', task 
':elasticsearch:sourcesJar', task 
':elasticsearch:publishElasticsearchPublicationToNexusRepository', task 
':elasticsearch:publish', task ':example:csv:compileJava', task 
':file:compileJava', task ':file:processResources', task ':file:classes', task 
':file:jar', task ':file:generateMetadataFileForFilePublication', task 
':file:generatePomFileForFilePublication', task 
':file:initializeNexusStagingRepository', task ':file:javadoc', task 
':file:javadocJar', task ':file:sourcesJar', task 
':file:publishFilePublicationToNexusRepository', task ':file:publish', task 
':geode:compileJava', task ':geode:processResources', task ':geode:classes', 
task ':geode:jar', task ':geode:generateMetadataFileForGeodePublication', task 
':geode:generatePomFileForGeodePublication', task 
':geode:initializeNexusStagingRepository', task ':geode:javadoc', task 
':geode:javadocJar', task ':geode:sourcesJar', task 
':geode:publishGeodePublicationToNexusRepository', task ':geode:publish', task 
':kafka:compileJava', task ':kafka:processResources', task ':kafka:classes', 
task ':kafka:jar', task ':kafka:generateMetadataFileForKafkaPublication', task 
':kafka:generatePomFileForKafkaPublication', task 
':kafka:initializeNexusStagingRepository', task ':kafka:javadoc', task 
':kafka:javadocJar', task ':kafka:sourcesJar', task 

Translation of SQL EXISTS

2020-02-19 Thread Christian Beikov

Hello,

I'm a bit confused about how the SQL EXISTS predicate is translated. I'd 
assume that an EXISTS is translated in relational algebra to a SEMI- and 
NOT EXISTS to an ANTI-join, but it's not.


PlannerImpl p = new PlannerImpl(config);
SqlNode sqlNode = p.parse("delete from _order o where exists (select 1 
from order_position p where o.id = p.order_id)");

p.validate(sqlNode);
RelRoot rel = p.rel(sqlNode);
RelToSqlConverter sqlConverter = new RelToSqlConverter(dialect);
SqlImplementor.Result result = sqlConverter.visitChild(0, rel.rel);
sqlWriter.format(result.asStatement());

Worse, when printing this, I only get DELETE FROM "public"."_order", i.e. 
the EXISTS part is not rendered. This is the plan I get.


LogicalTableModify(table=[[adhoc, _order]], operation=[DELETE], 
flattened=[true])

  LogicalProject(inputs=[0])
    LogicalProject(inputs=[0], exprs=[[CAST($1):BIGINT, CAST($2):BOOLEAN]])
  LogicalJoin(condition=[=($0, $1)], joinType=[inner])
    JdbcTableScan(table=[[adhoc, _order]])
    LogicalAggregate(group=[{0}], agg#0=[MIN($1)])
  LogicalProject(exprs=[[$1, true]])
    JdbcTableScan(table=[[adhoc, order_position]])

I'd expect something along the lines of

LogicalTableModify(table=[[adhoc, _order]], operation=[DELETE], 
flattened=[true])

  LogicalProject(inputs=[0])
  LogicalJoin(condition=[=($0, $1)], joinType=[semi])
    JdbcTableScan(table=[[adhoc, _order]])
    JdbcTableScan(table=[[adhoc, order_position]])

and for NOT EXISTS

LogicalTableModify(table=[[adhoc, _order]], operation=[DELETE], 
flattened=[true])

  LogicalProject(inputs=[0])
  LogicalJoin(condition=[=($0, $1)], joinType=[anti])
    JdbcTableScan(table=[[adhoc, _order]])
    JdbcTableScan(table=[[adhoc, order_position]])

Am I missing something, or does the current aggregate-based translation 
actually make sense?


I constructed relational algebra structures for some other statements 
with SEMI- and ANTI-joins and already noticed that these join types 
weren't handled in 
org.apache.calcite.rel.rel2sql.RelToSqlConverter#visit(org.apache.calcite.rel.core.Join), 
which I fixed locally. Is the lack of a translation intentional?


Is such a translation of SEMI- and ANTI-join to EXISTS and NOT EXISTS an 
over-simplification or would you say it's correct? As far as I 
understood from https://en.wikipedia.org/wiki/Relational_algebra this is 
correct.
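For what it's worth, the algebraic equivalence can be checked on toy data. The following is a plain-Python illustration of the correspondence (not Calcite code; the table and predicate names just mirror the example query):

```python
# Toy check that filtering with EXISTS equals a semi-join, and NOT EXISTS
# equals an anti-join, on small in-memory "tables".

orders = [{"id": 1}, {"id": 2}, {"id": 3}]
positions = [{"order_id": 1}, {"order_id": 3}]

def semi_join(left, right, pred):
    """Rows of left that have at least one match in right."""
    return [l for l in left if any(pred(l, r) for r in right)]

def anti_join(left, right, pred):
    """Rows of left that have no match in right."""
    return [l for l in left if not any(pred(l, r) for r in right)]

pred = lambda o, p: o["id"] == p["order_id"]

# WHERE EXISTS (...) and WHERE NOT EXISTS (...), spelled out directly:
exists = [o for o in orders if any(pred(o, p) for p in positions)]
not_exists = [o for o in orders if not any(pred(o, p) for p in positions)]

assert exists == semi_join(orders, positions, pred)      # ids 1 and 3
assert not_exists == anti_join(orders, positions, pred)  # id 2
```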


I'd be happy to contribute that back. I didn't look into the Sql-to-Rel 
translation for EXISTS and NOT EXISTS to SEMI- and ANTI-join yet, but I 
assume that's not that hard and I could add that.


Regards,

Christian



Re: How to trace a column back to its original column

2020-02-19 Thread JiaTao Tao
Thanks a lot

But I found that if the select is from a subquery, `getFieldOrigins` returns
null, because you cannot find the table (see the pic.).

String sql = "SELECT T1.NAME,\n"
+ "   T1.NID,\n"
+ "   T2.NID,\n"
+ "   T2.COMPANY\n"
+ "FROM\n"
+ "  (SELECT (U.ID +100) AS NID,\n"
+ "  U.NAME AS NAME\n"
+ "   FROM USERS U) T1\n"
+ "JOIN\n"
+ "  (SELECT (J.ID +100) AS NID,\n"
+ "  J.COMPANY\n"
+ "   FROM JOBS J) T2 ON T1.NID = T2.NID";

[image: image.png]


Regards!

Aron Tao


On Wed, Feb 19, 2020 at 5:38 PM, Seliverstov Igor wrote:

> Aron Tao,
>
> I think you need the following two methods:
>
> org.apache.calcite.sql.validate.SqlValidatorImpl#getFieldOrigins
> java.sql.ResultSetMetaData#getTableName
>
> Regards,
> Igor
>
> > On Feb 19, 2020, at 10:29, JiaTao Tao wrote:
> >
> > Hi
> > Thanks a lot.
> > Can you give more details, say which method/field?
> >
> > Regards!
> >
> > Aron Tao
> >
> >
> > On Wed, Feb 19, 2020 at 2:31 PM, Seliverstov Igor wrote:
> >
> >> You can use their origins (says where a column came from).
> >>
> >> It's accessible from SqlValidatorImpl or the JDBC result set.
> >>
> >> On Wed, Feb 19, 2020, 9:25 AM, JiaTao Tao wrote:
> >>
> >>> What I really need is to collect every part of the SQL: which columns
> are
> >>> used as filters, and which are used as projections (columns on the source
> >>> table). But
> >>> the existence of nested subqueries complicates the issue.
> >>>
> >>>
> >>> Regards!
> >>>
> >>> Aron Tao
> >>>
> >>>
> >>> On Wed, Feb 19, 2020 at 2:17 PM, JiaTao Tao wrote:
> >>>
>  SQL like this:
>  ```
>  SELECT T1.NID,
>    T1.NAME,
>    T2.COMPANY
>  FROM
>   (SELECT (U.ID +100) AS NID,
>   U.NAME AS NAME
>    FROM USERS U) T1
>  JOIN
>   (SELECT (J.ID +100) AS NID,
>   J.COMPANY
>    FROM JOBS J) T2 ON T1.NID = T2.NID
>  ```
>  What I wanted is that T1.NID comes from USER.ID, T1.NAME comes from
>  USER.NAME.
> 
>  Has anyone done similar work? Is there a ready-made example to refer
> >> to?
> 
> 
>  Regards!
> 
>  Aron Tao
> 
> >>>
> >>
>
>


Re: How to trace a column back to its original column

2020-02-19 Thread JiaTao Tao
Thanks a lot!

It seems I can use
org.apache.calcite.rel.metadata.RelMetadataQuery#getColumnOrigin.
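To illustrate the general idea of tracing lineage through nested projections, here is a toy Python sketch with hypothetical structures — not how RelMetadataQuery works internally. Note that column origins deliberately report nothing for computed columns, whereas this toy tracks every contributing base column (closer in spirit to expression lineage):

```python
# Toy lineage sketch: each projection layer maps an output column to the set
# of input columns that its defining expression reads; tracing walks the
# layers from the outside in, down to base-table columns.

def trace(column, layers):
    """layers: outermost-first list of {output column: set of input columns}."""
    cols = {column}
    for layer in layers:
        # A column absent from a layer passes through unchanged.
        cols = set().union(*(layer.get(c, {c}) for c in cols))
    return cols

# Mimics the thread's query: T1.NID = U.ID + 100 and T1.NAME = U.NAME.
t1 = {"NID": {"USERS.ID"}, "NAME": {"USERS.NAME"}}
nid_origin = trace("NID", [t1])  # NID derives from USERS.ID
```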

Regards!

Aron Tao


On Wed, Feb 19, 2020 at 2:36 PM, Walaa Eldin Moustafa wrote:

> You might check out this class [1].
>
> [1]
> https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rel/metadata/RelMdExpressionLineage.java
>
> On Tue, Feb 18, 2020 at 10:31 PM Seliverstov Igor 
> wrote:
> >
> > You can use their origins (says where a column came from).
> >
> > It's accessible from SqlValidatorImpl or the JDBC result set.
> >
> > On Wed, Feb 19, 2020, 9:25 AM, JiaTao Tao wrote:
> >
> > > What I really need is to collect every part of the SQL: which columns
> are
> > > used as filters, and which are used as projections (columns on the source
> > > table). But
> > > the existence of nested subqueries complicates the issue.
> > >
> > >
> > > Regards!
> > >
> > > Aron Tao
> > >
> > >
> > > On Wed, Feb 19, 2020 at 2:17 PM, JiaTao Tao wrote:
> > >
> > > > SQL like this:
> > > > ```
> > > > SELECT T1.NID,
> > > >T1.NAME,
> > > >T2.COMPANY
> > > > FROM
> > > >   (SELECT (U.ID +100) AS NID,
> > > >   U.NAME AS NAME
> > > >FROM USERS U) T1
> > > > JOIN
> > > >   (SELECT (J.ID +100) AS NID,
> > > >   J.COMPANY
> > > >FROM JOBS J) T2 ON T1.NID = T2.NID
> > > > ```
> > > > What I wanted is that T1.NID comes from USER.ID, T1.NAME comes from
> > > > USER.NAME.
> > > >
> > > > Has anyone done similar work? Is there a ready-made example to refer
> to?
> > > >
> > > >
> > > > Regards!
> > > >
> > > > Aron Tao
> > > >
> > >
>


[jira] [Created] (CALCITE-3806) How to optimize repeated RelNode Structures ?

2020-02-19 Thread anjali shrishrimal (Jira)
anjali shrishrimal created CALCITE-3806:
---

 Summary: How to optimize repeated RelNode Structures ?
 Key: CALCITE-3806
 URL: https://issues.apache.org/jira/browse/CALCITE-3806
 Project: Calcite
  Issue Type: Wish
  Components: core
Reporter: anjali shrishrimal


Let's say input structure looks like this :
{noformat}
LogicalUnion(all=[true])
  LogicalProject(EMPNO=[$0])
LogicalFilter(condition=[>=($0, 7369)])
  LogicalTableScan(table=[[scott, EMP]])
  LogicalProject(EMPNO=[$0])
LogicalFilter(condition=[>=($0, 7369)])
  LogicalTableScan(table=[[scott, EMP]]){noformat}
 

In this case,
{noformat}
  LogicalProject(EMPNO=[$0])
LogicalFilter(condition=[>=($0, 7369)])
  LogicalTableScan(table=[[scott, EMP]]){noformat}
is repeated. It is going to fetch the same data twice.

Can we save one fetch? Can we somehow tell the 2nd input of the union to make use of
the union's 1st input? Is there any way to express that in the plan?
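One way to detect the repetition — sketched below in Python, purely as an illustration of the idea and not actual Calcite code — is to compute a digest for every subtree bottom-up and reuse the first node seen with a given digest, turning the tree into a DAG. (Calcite's RelNode digests, and the Spool operators introduced for recursive queries, are the closest real counterparts.)

```python
# Detect repeated subtrees via bottom-up digests: a node is represented as
# (kind, attrs, children); identical subtrees collapse to one shared object.

def dedup(node, seen=None):
    """Return a DAG in which structurally identical subtrees are shared."""
    if seen is None:
        seen = {}
    kind, attrs, children = node
    children = tuple(dedup(c, seen) for c in children)
    # Children are already deduplicated, so their identities form the digest.
    digest = (kind, attrs, tuple(id(c) for c in children))
    return seen.setdefault(digest, (kind, attrs, children))

scan = ("Scan", "scott.EMP", ())
branch = ("Project", "EMPNO", (("Filter", ">=($0, 7369)", (scan,)),))
union = dedup(("Union", "all", (branch, branch)))
left, right = union[2]
assert left is right  # the repeated input is now a single shared subtree
```

An execution layer could then evaluate each shared subtree once and feed both consumers, which is what a spool-style operator expresses in a plan.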

 

Also,
If the structure was like this :
{noformat}
LogicalUnion(all=[true])
  LogicalProject(EMPNO=[$0])
LogicalFilter(condition=[>=($0, 7369)])
  LogicalTableScan(table=[[scott, EMP]])
  LogicalProject(EMPNO=[$0])
LogicalFilter(condition=[>=($0, 8000)])
  LogicalTableScan(table=[[scott, EMP]]){noformat}

The second part of the union can perform filtering on the fetched data of the first 
part (as the second's output is a subset of the first's output).
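This second case is predicate subsumption: when one branch's predicate implies the other's, the stricter branch can filter the looser branch's already-fetched rows instead of scanning the table again. A toy illustration (plain Python, not Calcite):

```python
# x >= 8000 implies x >= 7369, so the stricter branch can be computed from the
# looser branch's output without a second table scan.

rows = [7000, 7369, 7500, 8000, 9000]            # stand-in for EMP.EMPNO
loose = [r for r in rows if r >= 7369]           # first branch: one scan
strict = [r for r in loose if r >= 8000]         # second branch reuses the fetch
assert strict == [r for r in rows if r >= 8000]  # same result as a second scan
```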

 

Does Calcite provide this kind of optimization?

If not, what are the challenges to do so?

 

 

I would appreciate some comments on this. Thank you.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: How to trace a column back to its original column

2020-02-19 Thread Seliverstov Igor
Aron Tao,

I think you need the following two methods:

org.apache.calcite.sql.validate.SqlValidatorImpl#getFieldOrigins
java.sql.ResultSetMetaData#getTableName

Regards,
Igor

> On Feb 19, 2020, at 10:29, JiaTao Tao wrote:
> 
> Hi
> Thanks a lot.
> Can you give more details, say which method/field?
> 
> Regards!
> 
> Aron Tao
> 
> 
> On Wed, Feb 19, 2020 at 2:31 PM, Seliverstov Igor wrote:
> 
>> You can use their origins (says where a column came from).
>> 
>> It's accessible from SqlValidatorImpl or the JDBC result set.
>> 
>> On Wed, Feb 19, 2020, 9:25 AM, JiaTao Tao wrote:
>> 
>>> What I really need is to collect every part of the SQL: which columns are
>>> used as filters, and which are used as projections (columns on the source
>>> table). But
>>> the existence of nested subqueries complicates the issue.
>>> 
>>> 
>>> Regards!
>>> 
>>> Aron Tao
>>> 
>>> 
>>> On Wed, Feb 19, 2020 at 2:17 PM, JiaTao Tao wrote:
>>> 
 SQL like this:
 ```
 SELECT T1.NID,
   T1.NAME,
   T2.COMPANY
 FROM
  (SELECT (U.ID +100) AS NID,
  U.NAME AS NAME
   FROM USERS U) T1
 JOIN
  (SELECT (J.ID +100) AS NID,
  J.COMPANY
   FROM JOBS J) T2 ON T1.NID = T2.NID
 ```
 What I wanted is that T1.NID comes from USER.ID, T1.NAME comes from
 USER.NAME.
 
 Has anyone done similar work? Is there a ready-made example to refer
>> to?
 
 
 Regards!
 
 Aron Tao
 
>>> 
>> 



Re: Correlation with a variable

2020-02-19 Thread Stamatis Zampetakis
I don't know if there is already an issue around this topic, I think not.

Now regarding point (b) I would prefer to keep the plan as independent as
possible from statement and other parts of the execution.

Other than that I encountered a few use-cases where named parameters may be
a better fit:
* Plans with parameters that are created from different components of an
application and need to be combined together. It is more difficult to
ensure that param collisions do not occur when we are using index based
parameters. Having named parameters where each component uses a reserved
prefix makes things slightly easier to handle.
* Using the same parameter (value) in multiple places (which could help in
simplifications) is semantically easier to express with names.
* Printing and debugging a plan.
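The collision point in the first bullet can be made concrete with a small sketch (illustrative Python, not Calcite code): two independently built fragments that both number their parameters from 0 clash when merged, while name-based parameters under per-component prefixes do not.

```python
# Two components each number their parameters from 0, so merging index-based
# fragments silently collides on slot 0...
frag_a_indexed = {0: "customer_id", 1: "since"}
frag_b_indexed = {0: "region"}
merged_indexed = {**frag_a_indexed, **frag_b_indexed}
# ...index 0 now holds "region": frag_b overwrote frag_a's parameter.

# ...while named parameters with a reserved prefix per component stay distinct.
frag_a_named = {"a.customer_id": None, "a.since": None}
frag_b_named = {"b.region": None}
merged_named = {**frag_a_named, **frag_b_named}  # three distinct parameters
```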

Certainly there are also benefits in keeping a single RexNode class for
parameters so I am fine with whatever decision we take in the end.

Best,
Stamatis

On Tue, Feb 18, 2020, 9:18 PM Julian Hyde  wrote:

> It’s worth separating this into (a) what the parser should do (i.e. how we
> represent this in SqlNode land) and (b) how we represent this in
> RexNode/RelNode land.
>
> I think the parser should generate some flavor of SqlDynamicParam.
> Currently these only have ‘int index’ but we’d need to hold old/new as a
> boolean and “somecolumn” as a string.
>
> In RexNode land I think these should remain RexDynamicParam with an int
> index. There would be some mapping table in the statement saying that
> old.somecolumn should be bound into RexDynamicParam #0, etc. I believe this
> is what Oracle’s PL/SQL pre-processor does.
>
> SqlToRelConverter would need to be changed in order to map between these
> representations.
>
> Julian
>
>
> > On Feb 18, 2020, at 11:15 AM, Christian Beikov <
> christian.bei...@gmail.com> wrote:
> >
> > Hey Stamatis,
> >
> > thanks for your help.
> >
> > Am 18.02.2020 um 19:34 schrieb Stamatis Zampetakis:
> >> Hi Christian,
> >>
> >> Long story short: maybe you can achieve what you want by adding a new
> type
> >> of RexNode. From what I remember quoting is applied to identifiers
> (schema,
> >> table, column names) so maybe if OLD/NEW are parameters (e.g.,
> >> RexNamedParam) then possibly it does not make sense to quote them.
> > Right, and actually it is wrong to quote these variables/parameters, i.e.
> it's an error, as PG complains that there is no relation with that name.
> >> At row level triggers OLD and NEW refer to a single tuple/row at each
> point
> >> in time. In this case, OLD and NEW are row type variables, or better say
> >> parameters, with the same type as the table. In Calcite, there are
> >> index-based parameters (RexDynamicParam) but not named parameters as the
> >> one you seem to need. I think named parameters are useful in various
> >> scenarios so possibly we could expand the RexNode hierarchy.
> >> In this case to build the plan probably it suffices to create and plug
> the
> >> parameter wherever you need.
> > That sounds great and just about what I would need. Is there an issue
> for this already?
> >> At statement level triggers OLD and NEW refer to set of tuples/rows at
> each
> >> point in time. In this case, OLD and NEW appear as (temporary)
> >> relations/tables with the same type as the table. In terms of
> >> implementation, I assume that the user defined query acts as a subquery
> >> correlated with OLD/NEW as necessary.
> > Correct, but right now I am using row level triggers. I'll try to
> introduce some kind of synthetic view that holds the state so I can use a
> TransientScan for now.
> >> In this case to build the plan probably you need to introduce scan
> >> operations over OLD/NEW tables and create a correlation with the rest of
> >> the query.
> > As of PG10 the REFERENCING clause can be used to introduce a temporary
> view for an old and new relation.
> >> Best,
> >> Stamatis
> >>
> >> On Tue, Feb 18, 2020 at 7:03 PM Christian Beikov <
> christian.bei...@gmail.com>
> >> wrote:
> >>
> >>> My issue is not about parsing. I already have the relational model, I
> >>> parsed a query to which I want to add add a condition to a RelNode.
> Now I
> >>> want to add a RexNode to the LogicalFilter node that renders to:
> >>> NEW."somecolumn"
> >>>
> >>> How would I construct a RexNode that renders to that when converting
> the
> >>> RelNode to SQL. Do I have to extend the SqlDialect to support that?
> >>>
> >>> Danny Chan  schrieb am Di., 18. Feb. 2020,
> 15:12:
> >>>
>  If you want to make NEW a normal sql identifier, you should override
> it
> >>> in
>  the parser to make it unreserved.
> 
>  Christian Beikov wrote on Tue, Feb 18, 2020 at 3:11 PM:
> 
> > Hey Danny,
> >
> > it's not a view, it's a variable in PL/SQL with a row type. The thing
> > is, variable names must not be quoted, but I have no idea how to
> avoid
> > quoting for this single use case with the relational algebra model in
> > Calcite.
> >
> > Regards,
> >
> 

Parse PostgreSQL query using Calcite-BABEL

2020-02-19 Thread Shivraj Singh
Hi,
I need a little more help. I am using calcite-babel for parsing the query:
I have a SQL query: select * from my_table1 dwm INNER JOIN my_table2 wl ON
(dwm.id = wl.id) where dwm.time -
'timeinterval'::interval

and this is my config for using BABEL:




val sqlParserConfig = SqlParser.configBuilder()
  .setParserFactory(SqlBabelParserImpl.FACTORY)
  .setConformance(SqlConformanceEnum.BABEL)
  .build()
SqlParser.create(sql, sqlParserConfig).parseQuery()

but this gave me an exception on ".time" (Was expecting one of)
*Shivraj Singh*
Software Consultant
Knoldus Inc. 
+91-8800782123
Canada - USA - India - Singapore