Hi everyone,

Allow me to introduce my good friend Siyuan Liu, who is the leader of the Quicksql project.
I have CC'd him and asked him to introduce the project to us. Here is the documentation link for Quicksql [1].

[1] https://quicksql.readthedocs.io/en/latest/

Regards,
Francis

Juan Pan <panj...@apache.org> wrote on Mon, Dec 23, 2019 at 11:44 AM:

> Thanks Gelbana,
>
> I really appreciate your explanation, which sheds some light for me on
> exploring Calcite. :)
>
> Best wishes,
> Trista
>
> Juan Pan (Trista)
>
> Senior DBA & PPMC of Apache ShardingSphere(Incubating)
> E-mail: panj...@apache.org
>
> On 12/22/2019 05:58, Muhammad Gelbana <m.gelb...@gmail.com> wrote:
> I am curious how to join the tables from different datasources.
>
> Based on Calcite's conventions concept, the Join operator and its input
> operators should all have the same convention. If they don't, the
> convention different from the Join operator's convention will have to
> register a converter rule. This rule should produce an operator that only
> converts from that convention to the Join operator's convention.
>
> This way the Join operator will be able to handle the data obtained from
> its input operators because it understands the data structure.
>
> Thanks,
> Gelbana
>
> On Wed, Dec 18, 2019 at 5:08 AM Juan Pan <panj...@apache.org> wrote:
>
> Some updates.
>
> Recently I took a look at their docs and source code, and found that this
> project uses Calcite's SQL parsing and relational algebra to get the query
> plan, and also translates to Spark SQL for joining different datasources,
> or to the corresponding query for a single datasource.
>
> Although it copies many classes from Calcite, the idea of QuickSQL seems
> of some interest, and the code is succinct.
>
> Best,
> Trista
>
> Juan Pan (Trista)
>
> Senior DBA & PPMC of Apache ShardingSphere(Incubating)
> E-mail: panj...@apache.org
>
> On 12/13/2019 17:16, Juan Pan <panj...@apache.org> wrote:
> Yes, indeed.
>
> Juan Pan (Trista)
>
> Senior DBA & PPMC of Apache ShardingSphere(Incubating)
> E-mail: panj...@apache.org
>
> On 12/12/2019 18:00, Alessandro Solimando <alessandro.solima...@gmail.com>
> wrote:
> Adapters must surely be needed for data sources that do not support SQL;
> I think this is what Juan Pan was asking about.
>
> On Thu, 12 Dec 2019 at 04:05, Haisheng Yuan <hy...@apache.org> wrote:
>
> Nope, it doesn't use any adapters. It just submits partial SQL queries to
> different engines.
>
> If the query contains tables from a single source, e.g.
> select count(*) from hive_table1, hive_table2 where a=b;
> then the whole query will be submitted to Hive.
>
> Otherwise, e.g.
> select distinct a,b from hive_table union select distinct a,b from mysql_table;
>
> The following query will be submitted to Spark and executed by Spark:
> select a,b from spark_tmp_table1 union select a,b from spark_tmp_table2;
>
> spark_tmp_table1: select distinct a,b from hive_table
> spark_tmp_table2: select distinct a,b from mysql_table
>
> On 2019/12/11 04:27:07, "Juan Pan" <panj...@apache.org> wrote:
> Hi Haisheng,
>
> The queries on the different data sources are then registered as temp
> Spark tables (with filters or joins pushed in); the whole query is
> rewritten as SQL text over these temp tables and submitted to Spark.
>
> Does this mean QuickSQL also needs adapters to execute queries on
> different data sources?
>
> Yes, virtualization is one of Calcite’s goals. In fact, when I created
> Calcite I was thinking about virtualization + in-memory materialized views.
> Not only the Spark convention but any of the “engine” conventions (Drill,
> Flink, Beam, Enumerable) could be used to create a virtual query engine.
>
> Basically, I like and agree with Julian’s statement. It is a great idea,
> and I personally hope Calcite moves towards it.
>
> Give my best wishes to the Calcite community.
>
> Thanks,
> Trista
>
> Juan Pan
>
> panj...@apache.org
> Juan Pan(Trista), Apache ShardingSphere
>
> On 12/11/2019 10:53, Haisheng Yuan <h.y...@alibaba-inc.com> wrote:
> As far as I know, users still need to register tables from other data
> sources before querying them. QuickSQL uses Calcite for parsing queries
> and optimizing logical expressions with several transformation rules. The
> queries on the different data sources are then registered as temp Spark
> tables (with filters or joins pushed in); the whole query is rewritten as
> SQL text over these temp tables and submitted to Spark.
>
> - Haisheng
>
> ------------------------------------------------------------------
> From: Rui Wang <amaliu...@apache.org>
> Date: 2019-12-11 06:24:45
> To: <dev@calcite.apache.org>
> Subject: Re: Quicksql
>
> The co-routine model sounds like it fits streaming cases well.
>
> I was thinking about how the Enumerable interface should work with
> streaming cases, but now I should also check the Interpreter.
>
> -Rui
>
> On Tue, Dec 10, 2019 at 1:33 PM Julian Hyde <jh...@apache.org> wrote:
>
> The goal (or rather my goal) for the interpreter is to replace
> Enumerable as the quick, easy default convention.
>
> Enumerable is efficient but not that efficient (compared to engines
> that work on off-heap data representing batches of records). And
> because it generates java byte code there is a certain latency to
> getting a query prepared and ready to run.
>
> It basically implements the old Volcano query evaluation model. It is
> single-threaded (because all work happens as a result of a call to
> 'next()' on the root node) and cannot handle branching data-flow
> graphs (DAGs).
>
> The Interpreter uses a co-routine model (reading from queues,
> writing to queues, and yielding when there is no work to be done) and
> therefore could be more efficient than Enumerable in a single-node
> multi-core system. Also, there is little start-up time, which is
> important for small queries.
>
> I would love to add another built-in convention that uses Arrow as
> data format and generates co-routines for each operator. Those
> co-routines could be deployed in a parallel and/or distributed data
> engine.
>
> Julian
>
> On Tue, Dec 10, 2019 at 3:47 AM Zoltan Farkas
> <zolyfar...@yahoo.com.invalid> wrote:
>
> What is the ultimate goal of the Calcite Interpreter?
>
> To provide some context, I have been playing around with Calcite + REST
> (see https://github.com/zolyfarkas/jaxrs-spf4j-demo/wiki/AvroCalciteRest
> for details of my experiments)
>
> -Z
>
> On Dec 9, 2019, at 9:05 PM, Julian Hyde <jh...@apache.org> wrote:
>
> Yes, virtualization is one of Calcite’s goals. In fact, when I created
> Calcite I was thinking about virtualization + in-memory materialized
> views. Not only the Spark convention but any of the “engine” conventions
> (Drill, Flink, Beam, Enumerable) could be used to create a virtual query
> engine.
>
> See e.g. a talk I gave in 2013 about Optiq (precursor to Calcite):
> https://www.slideshare.net/julianhyde/optiq-a-dynamic-data-management-framework
>
> Julian
>
> On Dec 9, 2019, at 2:29 PM, Muhammad Gelbana <mgelb...@apache.org>
> wrote:
>
> I recently contacted one of the active contributors asking about the
> purpose of the project and here's his reply:
>
> From my understanding, Quicksql is a data virtualization platform. It
> can query multiple data sources together and in a distributed way.
> Say, you can write a SQL query that joins a MySQL table with an
> Elasticsearch table. Quicksql can recognize that, and then generate
> Spark code, in which it will fetch the MySQL/ES data as temporary
> tables separately, and then join them in Spark. The execution is in
> Spark so it is totally distributed.
> The user doesn't need to be aware of where the table is from.
>
> I understand that the Spark convention Calcite has attempts to
> achieve the same goal, but it isn't fully implemented yet.
>
> On Tue, Oct 29, 2019 at 9:43 PM Julian Hyde <jh...@apache.org> wrote:
>
> Anyone know anything about Quicksql? It seems to be quite a popular
> project, and they have an internal fork of Calcite.
>
> https://github.com/Qihoo360/
>
> https://github.com/Qihoo360/Quicksql/tree/master/analysis/src/main/java/org/apache/calcite
>
> Julian
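
For readers following the thread, the routing rule Haisheng describes (submit the whole query to the source engine when only one source is involved, otherwise register per-source subqueries as temp tables and hand a rewritten query to Spark) can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: it handles a plain UNION of single-table selects, and Branch and plan_union are hypothetical names that are not part of Quicksql or Calcite. How Quicksql actually implements the split is not shown in this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Branch:
    """One UNION branch (hypothetical model, not a Quicksql class)."""
    columns: str          # selected columns, e.g. "a,b"
    table: str            # table name, e.g. "hive_table"
    source: str           # engine that owns the table, e.g. "hive"
    distinct: bool = False

    def sql(self) -> str:
        keyword = "select distinct" if self.distinct else "select"
        return f"{keyword} {self.columns} from {self.table}"


def plan_union(branches):
    """Return (engine, query, temp_tables) for a UNION of the branches."""
    sources = {b.source for b in branches}
    if len(sources) == 1:
        # Single source: the whole query goes to that engine unchanged.
        query = " union ".join(b.sql() for b in branches)
        return sources.pop(), query, {}
    # Mixed sources: each branch becomes a temp table that is fetched
    # from its own engine, and Spark executes the union over them.
    temp_tables = {}
    rewritten = []
    for i, b in enumerate(branches, start=1):
        name = f"spark_tmp_table{i}"
        temp_tables[name] = b.sql()
        rewritten.append(f"select {b.columns} from {name}")
    return "spark", " union ".join(rewritten), temp_tables


if __name__ == "__main__":
    # The mixed-source example from Haisheng's message.
    engine, query, temps = plan_union([
        Branch("a,b", "hive_table", "hive", distinct=True),
        Branch("a,b", "mysql_table", "mysql", distinct=True),
    ])
    print(engine)
    print(query)
    for name, sub in temps.items():
        print(f"{name}: {sub}")
```

Applied to the Hive/MySQL union quoted above, the sketch yields the same spark_tmp_table1 and spark_tmp_table2 definitions that Haisheng lists. A real implementation would work on the optimized relational plan rather than on strings, which is where the pushed-down filters and joins mentioned in the thread would come from.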