How to keep quotes in SqlIdentifier?

2019-12-13 Thread ????????????
Hi,
I got the SqlNode by using SqlParser.parseQuery() (SQL below), but I cannot find the backtick in SqlIdentifier.

Sql:

SELECT `sql`, id1 FROM testdata

Debug with IDEA:

Re: CI passed error tests

2019-12-13 Thread XING JIN
Thanks a lot, Stamatis~
I will run slow tests from my local.

Best,
Jin

On Saturday, December 14, 2019, Stamatis Zampetakis wrote:

> Hi Jin,
>
> If the tests run fine locally on your machine (e.g., ./gradlew testSlow)
> then they should be fine in the CI.
> And yes, labels are only available to committers!
>
> Best,
> Stamatis
>
> On Fri, Dec 13, 2019 at 4:52 PM XING JIN  wrote:
>
> > Thanks a lot, Ruben~
> > I'm working on "[CALCITE-3478] Restructure tests for materialized views"
> > (https://github.com/apache/calcite/pull/1560), which contains "slow
> > tests".
> > But I'm not sure how to add the label to the PR. Is it only available to
> > committers?
> > BTW, it would be great if someone could help review
> > https://github.com/apache/calcite/pull/1560. I've rebased the PR several
> > times to resolve conflicts.
> >
> > Best,
> > Jin
> >
> > On Fri, Dec 13, 2019 at 4:00 PM, Ruben Q L wrote:
> >
> > > Hi Jin,
> > >
> > > this is the expected behavior: slow tests are not executed by default in
> > > PR CI.
> > > If you want slow tests to be executed in your PR, you need to explicitly
> > > add the label "slow-tests-needed", as specified by [1].
> > > You can see a PR example with this label here [2].
> > >
> > > Best regards,
> > > Ruben
> > >
> > > [1] https://issues.apache.org/jira/browse/CALCITE-3141
> > > [2] https://github.com/apache/calcite/pull/1651
> > >
> > >
> > > On Fri, Dec 13, 2019 at 08:41, XING JIN wrote:
> > >
> > > > Hi guys,
> > > > I made a PR and ran the continuous integration tests. [1]
> > > > The PR contains a failing test tagged with @slowTest.
> > > > The test should have failed, but CI passed by mistake.
> > > > I suspect our current CI is not running with the 'testSlow'
> > > > configuration, is it?
> > > > I'm not sure if I should create a JIRA.
> > > >
> > > > Best,
> > > > Jin
> > > >
> > > > [1] https://github.com/apache/calcite/pull/1653
> > > >
> > >
> >
>


Re: CI passed error tests

2019-12-13 Thread Stamatis Zampetakis
Hi Jin,

If the tests run fine locally on your machine (e.g., ./gradlew testSlow)
then they should be fine in the CI.
And yes, labels are only available to committers!

Best,
Stamatis

On Fri, Dec 13, 2019 at 4:52 PM XING JIN  wrote:

> Thanks a lot, Ruben~
> I'm working on "[CALCITE-3478] Restructure tests for materialized views"
> (https://github.com/apache/calcite/pull/1560), which contains "slow
> tests".
> But I'm not sure how to add the label to the PR. Is it only available to
> committers?
> BTW, it would be great if someone could help review
> https://github.com/apache/calcite/pull/1560. I've rebased the PR several
> times to resolve conflicts.
>
> Best,
> Jin
>
> On Fri, Dec 13, 2019 at 4:00 PM, Ruben Q L wrote:
>
> > Hi Jin,
> >
> > this is the expected behavior: slow tests are not executed by default in
> > PR CI.
> > If you want slow tests to be executed in your PR, you need to explicitly
> > add the label "slow-tests-needed", as specified by [1].
> > You can see a PR example with this label here [2].
> >
> > Best regards,
> > Ruben
> >
> > [1] https://issues.apache.org/jira/browse/CALCITE-3141
> > [2] https://github.com/apache/calcite/pull/1651
> >
> >
> > On Fri, Dec 13, 2019 at 08:41, XING JIN wrote:
> >
> > > Hi guys,
> > > I made a PR and ran the continuous integration tests. [1]
> > > The PR contains a failing test tagged with @slowTest.
> > > The test should have failed, but CI passed by mistake.
> > > I suspect our current CI is not running with the 'testSlow'
> > > configuration, is it?
> > > I'm not sure if I should create a JIRA.
> > >
> > > Best,
> > > Jin
> > >
> > > [1] https://github.com/apache/calcite/pull/1653
> > >
> >
>


Re: CI passed error tests

2019-12-13 Thread XING JIN
Thanks a lot, Ruben~
I'm working on "[CALCITE-3478] Restructure tests for materialized views"
(https://github.com/apache/calcite/pull/1560), which contains "slow tests".
But I'm not sure how to add the label to the PR. Is it only available to
committers?
BTW, it would be great if someone could help review
https://github.com/apache/calcite/pull/1560. I've rebased the PR several
times to resolve conflicts.

Best,
Jin

On Fri, Dec 13, 2019 at 4:00 PM, Ruben Q L wrote:

> Hi Jin,
>
> this is the expected behavior: slow tests are not executed by default in PR
> CI.
> If you want slow tests to be executed in your PR, you need to explicitly
> add the label "slow-tests-needed", as specified by [1].
> You can see a PR example with this label here [2].
>
> Best regards,
> Ruben
>
> [1] https://issues.apache.org/jira/browse/CALCITE-3141
> [2] https://github.com/apache/calcite/pull/1651
>
>
> On Fri, Dec 13, 2019 at 08:41, XING JIN wrote:
>
> > Hi guys,
> > I made a PR and ran the continuous integration tests. [1]
> > The PR contains a failing test tagged with @slowTest.
> > The test should have failed, but CI passed by mistake.
> > I suspect our current CI is not running with the 'testSlow'
> > configuration, is it?
> > I'm not sure if I should create a JIRA.
> >
> > Best,
> > Jin
> >
> > [1] https://github.com/apache/calcite/pull/1653
> >
>


Calcite-Master - Build # 1497 - Failure

2019-12-13 Thread Apache Jenkins Server
The Apache Jenkins build system has built Calcite-Master (build #1497)

Status: Failure

Check console output at https://builds.apache.org/job/Calcite-Master/1497/ to 
view the results.

Re: [Discuss] Make flattening on Struct/Row optional

2019-12-13 Thread Igor Guzenko
Hi Rui,

I'm glad that the fix was useful.

Thanks,
Igor


On Thu, Dec 12, 2019 at 8:16 PM Rui Wang  wrote:

> Absolutely. Thanks Igor for the contribution! :)
>
>
> -Rui
>
> On Wed, Dec 11, 2019 at 10:54 PM Stamatis Zampetakis 
> wrote:
>
> > So basically thanks to Igor :)
> >
> > On Wed, Dec 11, 2019 at 9:56 PM Rui Wang  wrote:
> >
> > > Thanks for Stamatis's suggestion. Indeed, a recent effort in [1] enhanced
> > > the support that reconstructs ROW in the top SELECT, which is supposed to
> > > solve the problem.
> > >
> > >
> > >
> > > [1]: https://jira.apache.org/jira/browse/CALCITE-3138
> > >
> > > On Mon, Dec 9, 2019 at 3:21 PM Rui Wang  wrote:
> > >
> > > > Hello,
> > > >
> > > > Sorry for the long delay on this thread. Recently I heard about
> > > > requests on how to deal with STRUCT without flattening it again in
> > > > BeamSQL. Also I realized Flink has already disabled it in their
> > > > codebase[1]. I did try to remove STRUCT flattening and run the unit
> > > > tests of calcite core to see how many tests break: it was 25, which
> > > > wasn't that bad. So I would like to pick up this effort again.
> > > >
> > > > Before I do it, I just want to ask if the Calcite community supports
> > > > this effort (or thinks it is a good idea)?
> > > >
> > > > My current execution plan will be the following:
> > > > 1. Add a new flag to FrameworkConfig to specify whether to flatten
> > > > STRUCT. By default, it is yes.
> > > > 2. When disabling the struct flattener, add more tests to test STRUCT
> > > > support in general. For example, test STRUCT support on projection,
> > > > join condition, filtering, etc. If something breaks, try to fix it.
> > > > 3. Check the 25 failed tests above and see why they fail when the
> > > > struct flattener is gone. Duplicate those failed tests but with the
> > > > necessary fixes to make sure they can pass without STRUCT flattening.
> > > >
> > > >
> > > > [1]:
> > > > https://github.com/apache/flink/blob/master/flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/calcite/FlinkPlannerImpl.scala#L166
> > > >
> > > >
> > > > -Rui
> > > >
> > > > On Wed, Sep 5, 2018 at 11:59 AM Julian Hyde wrote:
> > > >
> > > >> It might not be minor, but it’s worth a try. At optimization time we
> > > >> treat all fields as fields, regardless of whether they have complex
> > > >> types (maps, arrays, multisets, records) so there should not be too
> > > >> many problems. The flattening was mainly for the benefit of the runtime.
> > > >>
> > > >>
> > > >> > On Sep 5, 2018, at 11:32 AM, Rui Wang wrote:
> > > >> >
> > > >> > Thanks for your helpful response! It seems like disabling the
> > > >> > flattening will at least affect some rules in optimization. It might
> > > >> > not be a minor change.
> > > >> >
> > > >> >
> > > >> > -Rui
> > > >> >
> > > >> > On Wed, Sep 5, 2018 at 4:54 AM Stamatis Zampetakis <zabe...@gmail.com>
> > > >> > wrote:
> > > >> >
> > > >> >> Hi Rui,
> > > >> >>
> > > >> >> Disabling flattening in some cases seems reasonable.
> > > >> >>
> > > >> >> If I am not mistaken, even in the existing code it is not used all the
> > > >> >> time so it makes sense for it to become configurable.
> > > >> >> For example, Calcite prepared statements (CalcitePrepareImpl) are using
> > > >> >> the flattener only for DDL operations that create materialized views
> > > >> >> (and this is because this code at some point passes through the
> > > >> >> PlannerImpl).
> > > >> >> On the other hand, any query that is using the Planner will also pass
> > > >> >> through the flattener.
> > > >> >>
> > > >> >> Disabling the flattener does not mean that all rules will work without
> > > >> >> problems. The Javadoc of the RelStructuredTypeFlattener at some point
> > > >> >> says "This approach has the benefit that real optimizer and codegen
> > > >> >> rules never have to deal with structured types.". Due to this, it is
> > > >> >> very likely that some rules were written based on the fact that there
> > > >> >> are no structured types.
> > > >> >>
> > > >> >> Best,
> > > >> >> Stamatis
> > > >> >>
> > > >> >>
> > > >> >> On Wed, Sep 5, 2018 at 9:48 AM, Julian Hyde <jh...@apache.org>
> > > >> >> wrote:
> > > >> >>
> > > >> >>> Flattening was introduced mainly because the original engine used
> > > >> >>> flat column-oriented storage. Now we have several ways of executing,
> > > >> >>> including generating java code.
> > > >> >>>
> > > >> >>> Adding a mode to disable flattening might make sense.
> > > >> >>> On Tue, Sep 4, 2018 at 12:52 PM Rui Wang wrote:
> > > >> >>>>
> > > >> >>>> Hi Community,
> > > >> >>>>
> > > >> >>>> While trying to support Row type in Apache Beam SQL on top of
> > > >> >>>> Calcite, I

Re: Quicksql

2019-12-13 Thread Juan Pan
Thanks for your clarification, Haisheng.


I am curious how to join the tables from different datasources. 


Supposing there is tb1 in datasource1 and tb2 in datasource2 and the SQL is
`select tb1.col1, tb2.col2 from tb1, tb2 where tb1.id = tb2.id`, how are the
two tables joined together to produce the final result?


 Juan Pan (Trista) 
 
Senior DBA & PPMC of Apache ShardingSphere(Incubating)
E-mail: panj...@apache.org




On 12/12/2019 11:05,Haisheng Yuan wrote:
Nope, it doesn't use any adapters. It just submits partial SQL queries to
different engines.

If the query contains tables from a single source, e.g.
select count(*) from hive_table1, hive_table2 where a=b;
then the whole query will be submitted to Hive.

Otherwise, e.g.
select distinct a,b from hive_table union select distinct a,b from mysql_table;

The following query will be submitted to Spark and executed by Spark:
select a,b from spark_tmp_table1 union select a,b from spark_tmp_table2;

spark_tmp_table1: select distinct a,b from hive_table
spark_tmp_table2: select distinct a,b from mysql_table
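
The rewrite described above can be sketched in a few lines of Python. This is only an illustration of the idea, not QuickSQL's actual code: the helper name `rewrite_for_spark` and its list-of-pairs input are hypothetical, and a real implementation would work on the optimized plan rather than on SQL strings.

```python
def rewrite_for_spark(branches, combinator="union"):
    """Hypothetical helper: turn per-source subqueries into temp-table
    definitions plus one SQL text over those temp tables."""
    temp_tables = {}
    selects = []
    for i, (columns, subquery) in enumerate(branches, start=1):
        name = f"spark_tmp_table{i}"
        # The subquery is what gets pushed down to the source engine.
        temp_tables[name] = subquery
        # The outer query only ever sees the temp table.
        selects.append(f"select {columns} from {name}")
    return temp_tables, f" {combinator} ".join(selects) + ";"


temp_tables, spark_sql = rewrite_for_spark([
    ("a,b", "select distinct a,b from hive_table"),
    ("a,b", "select distinct a,b from mysql_table"),
])
print(spark_sql)
for name, subquery in temp_tables.items():
    print(f"{name}: {subquery}")
```

With the two branches from the example, the sketch produces the same temp-table definitions and the same outer query text as listed in the message.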

On 2019/12/11 04:27:07, "Juan Pan"  wrote:
Hi Haisheng,


> The query on different data source will then be registered as temp spark
> tables (with filter or join pushed in), the whole query is rewritten as SQL
> text over these temp tables and submitted to Spark.


Does it mean QuickSQL also needs adapters to execute queries on different
data sources?


> Yes, virtualization is one of Calcite’s goals. In fact, when I created
> Calcite I was thinking about virtualization + in-memory materialized views.
> Not only the Spark convention but any of the “engine” conventions (Drill,
> Flink, Beam, Enumerable) could be used to create a virtual query engine.


Basically, I like and agree with Julian’s statement. It is a great idea, which
I personally hope Calcite moves towards.


Give my best wishes to Calcite community.


Thanks,
Trista


Juan Pan


panj...@apache.org
Juan Pan(Trista), Apache ShardingSphere


On 12/11/2019 10:53,Haisheng Yuan wrote:
As far as I know, users still need to register tables from other data sources
before querying them. QuickSQL uses Calcite for parsing queries and optimizing
logical expressions with several transformation rules. The queries on different
data sources will then be registered as temp Spark tables (with filters or joins
pushed in), and the whole query is rewritten as SQL text over these temp tables
and submitted to Spark.

- Haisheng

--
From: Rui Wang
Date: 2019-12-11 06:24:45
To:
Subject: Re: Quicksql

The co-routine model sounds like a good fit for streaming cases.

I was thinking about how the Enumerable interface should work with streaming
cases, but now I should also check the Interpreter.


-Rui

On Tue, Dec 10, 2019 at 1:33 PM Julian Hyde  wrote:

The goal (or rather my goal) for the interpreter is to replace
Enumerable as the quick, easy default convention.

Enumerable is efficient but not that efficient (compared to engines
that work on off-heap data representing batches of records). And
because it generates java byte code there is a certain latency to
getting a query prepared and ready to run.

Enumerable basically implements the old Volcano query evaluation model. It is
single-threaded (because all work happens as a result of a call to
'next()' on the root node) and cannot handle branching data-flow
graphs (DAGs).

The Interpreter uses a co-routine model (reading from queues,
writing to queues, and yielding when there is no work to be done) and
therefore could be more efficient than enumerable in a single-node
multi-core system. Also, there is little start-up time, which is
important for small queries.

I would love to add another built-in convention that uses Arrow as
data format and generates co-routines for each operator. Those
co-routines could be deployed in a parallel and/or distributed data
engine.

Julian
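
To make the contrast between the two execution models in this message concrete, here is a minimal Python sketch. It is not Calcite's Enumerable or Interpreter code; the operator names, the `DONE` end-of-stream marker, and the round-robin scheduler are all assumptions made for illustration.

```python
from collections import deque

# Volcano-style pull model: every operator is an iterator, and all work
# happens because the consumer calls next() on the root operator.
def scan(rows):
    for row in rows:
        yield row

def filter_op(child, predicate):
    for row in child:
        if predicate(row):
            yield row

# Queue-based co-routines: each operator reads from an input queue, writes
# to an output queue, and yields control when there is no work to be done.
DONE = object()  # end-of-stream marker (an assumption of this sketch)

def scan_task(rows, out):
    for row in rows:
        out.append(row)
        yield  # hand control back to the scheduler after each row
    out.append(DONE)

def filter_task(inp, out, predicate):
    while True:
        if not inp:
            yield  # nothing to read yet; let another task run
            continue
        row = inp.popleft()
        if row is DONE:
            out.append(DONE)
            return
        if predicate(row):
            out.append(row)
        yield

def run(tasks):
    """Round-robin scheduler: resume each task until all have finished."""
    pending = deque(tasks)
    while pending:
        task = pending.popleft()
        try:
            next(task)
            pending.append(task)
        except StopIteration:
            pass

rows = [1, 2, 3, 4, 5, 6]

def is_even(value):
    return value % 2 == 0

# Pull model: the list() call drives everything through the root iterator.
pulled = list(filter_op(scan(rows), is_even))

# Co-routine model: tasks are scheduled independently and meet at queues.
q1, q2 = deque(), deque()
run([scan_task(rows, q1), filter_task(q1, q2, is_even)])
pushed = [row for row in q2 if row is not DONE]
```

Both variants compute the same result here. The difference is in who drives the work: in the second variant each task could in principle be scheduled on a different core, and one output queue could feed more than one consumer, which is what makes branching data-flow graphs expressible.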

On Tue, Dec 10, 2019 at 3:47 AM Zoltan Farkas wrote:

What is the ultimate goal of the Calcite Interpreter?

To provide some context, I have been playing around with Calcite + REST
(see https://github.com/zolyfarkas/jaxrs-spf4j-demo/wiki/AvroCalciteRest for
details of my experiments)


—Z

On Dec 9, 2019, at 9:05 PM, Julian Hyde  wrote:

Yes, virtualization is one of Calcite’s goals. In fact, when I created
Calcite I was thinking about virtualization + in-memory materialized views.
Not only the Spark convention but any of the “engine” conventions (Drill,
Flink, Beam, Enumerable) could be used to create a virtual query engine.

See e.g. a talk I gave in 2013 about Optiq (precursor to Calcite):
https://www.slideshare.net/julianhyde/optiq-a-dynamic-data-management-framework

Julian



On Dec 9, 2019, at 2:29 PM, Muhammad Gelbana wrote:

I recently contacted one of the active contributors asking about the

Re: Quicksql

2019-12-13 Thread Juan Pan
Yes, indeed.


 Juan Pan (Trista) 
 
Senior DBA & PPMC of Apache ShardingSphere(Incubating)
E-mail: panj...@apache.org




On 12/12/2019 18:00,Alessandro Solimando wrote:
Adapters are needed for data sources that do not support SQL; I think this is
what Juan Pan was asking about.

On Thu, 12 Dec 2019 at 04:05, Haisheng Yuan  wrote:

Nope, it doesn't use any adapters. It just submits partial SQL queries to
different engines.

If the query contains tables from a single source, e.g.
select count(*) from hive_table1, hive_table2 where a=b;
then the whole query will be submitted to Hive.

Otherwise, e.g.
select distinct a,b from hive_table union select distinct a,b from
mysql_table;

The following query will be submitted to Spark and executed by Spark:
select a,b from spark_tmp_table1 union select a,b from spark_tmp_table2;

spark_tmp_table1: select distinct a,b from hive_table
spark_tmp_table2: select distinct a,b from mysql_table

On 2019/12/11 04:27:07, "Juan Pan"  wrote:
Hi Haisheng,


> The query on different data source will then be registered as temp
> spark tables (with filter or join pushed in), the whole query is rewritten
> as SQL text over these temp tables and submitted to Spark.


Does it mean QuickSQL also needs adapters to execute queries on
different data sources?


> Yes, virtualization is one of Calcite’s goals. In fact, when I created
> Calcite I was thinking about virtualization + in-memory materialized views.
> Not only the Spark convention but any of the “engine” conventions (Drill,
> Flink, Beam, Enumerable) could be used to create a virtual query engine.


Basically, I like and agree with Julian’s statement. It is a great idea,
which I personally hope Calcite moves towards.


Give my best wishes to Calcite community.


Thanks,
Trista


Juan Pan


panj...@apache.org
Juan Pan(Trista), Apache ShardingSphere


On 12/11/2019 10:53,Haisheng Yuan wrote:
As far as I know, users still need to register tables from other data
sources before querying them. QuickSQL uses Calcite for parsing queries and
optimizing logical expressions with several transformation rules. The queries
on different data sources will then be registered as temp Spark tables (with
filters or joins pushed in), and the whole query is rewritten as SQL text over
these temp tables and submitted to Spark.

- Haisheng

--
From: Rui Wang
Date: 2019-12-11 06:24:45
To:
Subject: Re: Quicksql

The co-routine model sounds like a good fit for streaming cases.

I was thinking about how the Enumerable interface should work with streaming
cases, but now I should also check the Interpreter.


-Rui

On Tue, Dec 10, 2019 at 1:33 PM Julian Hyde  wrote:

The goal (or rather my goal) for the interpreter is to replace
Enumerable as the quick, easy default convention.

Enumerable is efficient but not that efficient (compared to engines
that work on off-heap data representing batches of records). And
because it generates java byte code there is a certain latency to
getting a query prepared and ready to run.

Enumerable basically implements the old Volcano query evaluation model. It is
single-threaded (because all work happens as a result of a call to
'next()' on the root node) and cannot handle branching data-flow
graphs (DAGs).

The Interpreter uses a co-routine model (reading from queues,
writing to queues, and yielding when there is no work to be done) and
therefore could be more efficient than enumerable in a single-node
multi-core system. Also, there is little start-up time, which is
important for small queries.

I would love to add another built-in convention that uses Arrow as
data format and generates co-routines for each operator. Those
co-routines could be deployed in a parallel and/or distributed data
engine.

Julian

On Tue, Dec 10, 2019 at 3:47 AM Zoltan Farkas wrote:

What is the ultimate goal of the Calcite Interpreter?

To provide some context, I have been playing around with Calcite + REST
(see https://github.com/zolyfarkas/jaxrs-spf4j-demo/wiki/AvroCalciteRest for
details of my experiments)


—Z

On Dec 9, 2019, at 9:05 PM, Julian Hyde  wrote:

Yes, virtualization is one of Calcite’s goals. In fact, when I created
Calcite I was thinking about virtualization + in-memory materialized
views.
Not only the Spark convention but any of the “engine” conventions (Drill,
Flink, Beam, Enumerable) could be used to create a virtual query engine.

See e.g. a talk I gave in 2013 about Optiq (precursor to Calcite):
https://www.slideshare.net/julianhyde/optiq-a-dynamic-data-management-framework

Julian



On Dec 9, 2019, at 2:29 PM, Muhammad Gelbana wrote:

I recently contacted one of the active contributors asking about the
purpose of the project and here's his reply:

From my understanding, Quicksql is a data virtualization platform. It can
query multiple data sources

Re: CI passed error tests

2019-12-13 Thread Ruben Q L
Hi Jin,

this is the expected behavior: slow tests are not executed by default in PR
CI.
If you want slow tests to be executed in your PR, you need to explicitly
add the label "slow-tests-needed", as specified by [1].
You can see a PR example with this label here [2].

Best regards,
Ruben

[1] https://issues.apache.org/jira/browse/CALCITE-3141
[2] https://github.com/apache/calcite/pull/1651
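
The label rule above amounts to a simple gate. The following Python sketch only illustrates that decision; it is not the project's CI configuration. The function name is hypothetical, and treating "test" as the default Gradle task is an assumption ("testSlow" and the label name come from this thread).

```python
def gradle_tasks_for_pr(labels):
    """Hypothetical helper: choose which Gradle test tasks a PR build runs.
    The real CI configuration lives in the Calcite repository and may differ."""
    tasks = ["test"]  # assumed default task for every PR build
    if "slow-tests-needed" in labels:
        # Slow tests run only when a committer has added the label.
        tasks.append("testSlow")
    return tasks


default_tasks = gradle_tasks_for_pr([])
labelled_tasks = gradle_tasks_for_pr(["slow-tests-needed"])
```

Under this rule a failing test tagged as slow cannot break an unlabelled PR build, which matches the behavior reported at the start of the thread.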


On Fri, Dec 13, 2019 at 08:41, XING JIN wrote:

> Hi guys,
> I made a PR and ran the continuous integration tests. [1]
> The PR contains a failing test tagged with @slowTest.
> The test should have failed, but CI passed by mistake.
> I suspect our current CI is not running with the 'testSlow' configuration,
> is it?
> I'm not sure if I should create a JIRA.
>
> Best,
> Jin
>
> [1] https://github.com/apache/calcite/pull/1653
>