Re: Quicksql

2020-03-02 Thread Siyuan Liu
Hi everyone,

Glad to see a lot of old friends here. Quicksql is a project born in early
2019. It was designed to shorten the long and complex workflows found in
the big data field, where there are many data sources, many compute
engines, and many types of syntax. The core idea is `Connect All Data
Sources with One Extra Parsing Cost`.

Because this involves standard SQL parsing, we chose Calcite as the parsing
engine, since it has the best SQL compatibility. Thanks to the excellent
architecture and toolkits provided by Calcite, Quicksql builds extensions
on this basis and enriches the logical plan definitions so that both
single-source and multi-source queries can be described. For a single data
source, an end-to-end connection query is established directly. For
multiple data sources, the logical plan is split and pushed down, and is
finally translated into code for a compute engine with distributed
computing capabilities (such as Spark or Flink), which merges the data.
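
The single-source versus multi-source decision described above can be
pictured with a minimal Java sketch. It is only an illustration and is not
taken from the Quicksql code base: the class `QueryRoutingSketch`, its
methods, and the returned plan strings are all hypothetical names, and the
input is assumed to be a simple map from table name to data source name.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names come from Quicksql.
class QueryRoutingSketch {

  /** Groups table names by the data source that owns them, keeping insertion order. */
  static Map<String, List<String>> groupBySource(Map<String, String> tableToSource) {
    Map<String, List<String>> bySource = new LinkedHashMap<>();
    for (Map.Entry<String, String> e : tableToSource.entrySet()) {
      bySource.computeIfAbsent(e.getValue(), k -> new ArrayList<>()).add(e.getKey());
    }
    return bySource;
  }

  /** Returns a textual description of how the query would be routed. */
  static String plan(Map<String, String> tableToSource) {
    if (tableToSource.isEmpty()) {
      throw new IllegalArgumentException("query references no tables");
    }
    Map<String, List<String>> bySource = groupBySource(tableToSource);
    if (bySource.size() == 1) {
      // Single data source: send the whole query straight to that source.
      return "DIRECT(" + bySource.keySet().iterator().next() + ")";
    }
    // Several data sources: push one sub-plan down per source,
    // then merge the partial results on a distributed compute engine.
    return "PUSHDOWN" + bySource.keySet() + " -> MERGE(engine)";
  }

  public static void main(String[] args) {
    Map<String, String> tables = new LinkedHashMap<>();
    tables.put("orders", "mysql");
    System.out.println(plan(tables));
    tables.put("logs", "elasticsearch");
    System.out.println(plan(tables));
  }
}
```

In a real system the sub-plans would be relational expressions rather than
strings; the sketch only shows where the split happens.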

Based on this design, Quicksql makes extensive use of Calcite's adapters,
dialects, and UDFs to provide syntax compatibility across the various data
sources and compute engines, and it also uses Avatica as its JDBC protocol.
We are very grateful for the excellent work provided by the Calcite
community.

At the beginning of the project, we were unsure which application areas
Quicksql should target. After one year of polishing, Quicksql has been
successfully applied in two areas:
1. Interactive Query Engine: provides interactive big data queries and BI
analysis with standard SQL syntax, with response times from seconds to
minutes.
2. ETL Compute Engine: SQL-based ETL across multiple data sources, which
can use the optimization capabilities of SQL for data cleaning,
transformation, joins, etc.
In the future, we will also focus on dynamic engine selection, so that
engines such as Hive, Spark, and Presto each run the SQL they are best
suited to.

Looking forward to working with the Calcite community to do some
interesting things and explore the unlimited possibilities of SQL.

Siyuan Liu

On Mon, Mar 2, 2020 at 3:45 PM Francis Du  wrote:

> Hi everyone:
>
> Allow me to introduce my good friend Siyuan Liu, who is the leader of
> the Quicksql project.
>
> I have CC'd him and asked him to introduce the project to us. Here is the
> documentation link for Quicksql [1].
>
> [1] https://quicksql.readthedocs.io/en/latest/
>
> Regards,
> Francis
>
> On Mon, Dec 23, 2019 at 11:44 AM Juan Pan wrote:
>
>> Thanks Gelbana,
>>
>>
>> I really appreciate your explanation, which sheds some light on my
>> exploration of Calcite. :)
>>
>>
>> Best wishes,
>> Trista
>>
>>
>>  Juan Pan (Trista)
>>
>> Senior DBA & PPMC of Apache ShardingSphere(Incubating)
>> E-mail: panj...@apache.org
>>
>>
>>
>>
>> On 12/22/2019 05:58,Muhammad Gelbana wrote:
>> I am curious how to join the tables from different datasources.
>> Based on Calcite's concept of conventions, the Join operator and its input
>> operators should all have the same convention. If they don't, any input
>> whose convention differs from the Join operator's convention will have to
>> register a converter rule. This rule should produce an operator that only
>> converts from that convention to the Join operator's convention.
>>
>> This way the Join operator will be able to handle the data obtained from
>> its input operators because it understands the data structure.
>>
>> Thanks,
>> Gelbana
>>
>>
>> On Wed, Dec 18, 2019 at 5:08 AM Juan Pan  wrote:
>>
>> Some updates.
>>
>>
>> Recently I took a look at their docs and source code, and found that this
>> project uses Calcite's SQL parsing and relational algebra to get the query
>> plan, and then translates it to Spark SQL for joining different
>> datasources, or to the corresponding query for a single datasource.
>>
>>
>> Although it copies many classes from Calcite, the idea of QuickSQL seems
>> quite interesting, and the code is succinct.
>>
>>
>> Best,
>> Trista
>>
>>
>> Juan Pan (Trista)
>>
>> Senior DBA & PPMC of Apache ShardingSphere(Incubating)
>> E-mail: panj...@apache.org
>>
>>
>>
>>
>> On 12/13/2019 17:16,Juan Pan wrote:
>> Yes, indeed.
>>
>>
>> Juan Pan (Trista)
>>
>> Senior DBA & PPMC of Apache ShardingSphere(Incubating)
>> E-mail: panj...@apache.org
>>
>>
>>
>>
>> On 12/12/2019 18:00,Alessandro Solimando
>> wrote:
>> Adapters are needed by data sources that do not support SQL. I think this
>> is what Juan Pan was asking about.
>>
>> On Thu, 12 Dec 2019 at 04:05, Haisheng Yuan  wrote:
>>
>> Nope

[jira] [Created] (CALCITE-2890) ElasticSearch adapter. Combine any_value with other aggregation functions failed

2019-03-03 Thread Siyuan Liu (JIRA)
Siyuan Liu created CALCITE-2890:
---

 Summary: ElasticSearch adapter. Combine any_value with other 
aggregation functions failed
 Key: CALCITE-2890
 URL: https://issues.apache.org/jira/browse/CALCITE-2890
 Project: Calcite
  Issue Type: Bug
  Components: elasticsearch-adapter
Affects Versions: 1.18.0
Reporter: Siyuan Liu


The following test cases, provided by Andrei Sereda in CALCITE-2669, do not pass.
{code:java}
// combine any_value with other aggregation functions (eg. max)
CalciteAssert.that()
  .with(newConnectionFactory())
  .query("select cat1, any_value(cat2), max(val1) from view group by cat1")
  .returnsUnordered("cat1=a; EXPR$1=g; EXPR$2=1.0",
      "cat1=null; EXPR$1=g; EXPR$2=null",
      "cat1=b; EXPR$1=h; EXPR$2=7.0");

CalciteAssert.that()
  .with(newConnectionFactory())
  .query("select max(val1), cat1, any_value(cat2) from view group by cat1")
  .returnsUnordered("EXPR$0=1.0; cat1=a; EXPR$2=g",
      "EXPR$0=null; cat1=null; EXPR$2=g",
      "EXPR$0=7.0; cat1=b; EXPR$2=h");

CalciteAssert.that()
  .with(newConnectionFactory())
  .query("select any_value(cat2), cat1, max(val1) from view group by cat1")
  .returnsUnordered("EXPR$0=g; cat1=a; EXPR$2=1.0",
      "EXPR$0=g; cat1=null; EXPR$2=null",
      "EXPR$0=h; cat1=b; EXPR$2=7.0");
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CALCITE-2679) Group by without aggregation function cannot be translated to correct JSON

2018-11-16 Thread Siyuan Liu (JIRA)
Siyuan Liu created CALCITE-2679:
---

 Summary: Group by without aggregation function cannot be 
translated to correct JSON
 Key: CALCITE-2679
 URL: https://issues.apache.org/jira/browse/CALCITE-2679
 Project: Calcite
  Issue Type: Bug
Reporter: Siyuan Liu
Assignee: Julian Hyde








[jira] [Created] (CALCITE-2525) ConcurrentModificationException may be triggered in ElasticsearchProject

2018-09-04 Thread Siyuan Liu (JIRA)
Siyuan Liu created CALCITE-2525:
---

 Summary: ConcurrentModificationException may be triggered in 
ElasticsearchProject
 Key: CALCITE-2525
 URL: https://issues.apache.org/jira/browse/CALCITE-2525
 Project: Calcite
  Issue Type: Bug
  Components: elasticsearch-adapter
Affects Versions: next
Reporter: Siyuan Liu
Assignee: Julian Hyde
 Fix For: next


{code:java}
// in ElasticsearchProject
for (String opfield : implementor.list) {
  if (opfield.startsWith("\"_source\"")) {
    implementor.list.remove(opfield);
  }
}
{code}
A ConcurrentModificationException can be triggered when the element 
currently being iterated over (`opfield`) is removed from the list inside 
the loop. This code should be replaced with list.removeIf(Predicate).
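
A minimal, self-contained sketch of the suggested replacement follows. It does not use the Calcite classes: a plain `ArrayList<String>` stands in for `implementor.list`, and the class and method names (`RemoveIfSketch`, `dropSourceFields`, `removalInsideForEachThrows`) are hypothetical. It shows the failing pattern next to `removeIf`, which performs the removal in one pass without a concurrent modification.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

// Hypothetical stand-in for the ElasticsearchProject logic; not Calcite code.
class RemoveIfSketch {

  /** Returns a copy of the fields without entries that start with "_source" (in quotes). */
  static List<String> dropSourceFields(List<String> fields) {
    List<String> copy = new ArrayList<>(fields);
    // removeIf removes matching elements safely in a single pass.
    copy.removeIf(f -> f.startsWith("\"_source\""));
    return copy;
  }

  /** Reproduces the reported pattern: removing from the list inside a for-each loop. */
  static boolean removalInsideForEachThrows(List<String> fields) {
    List<String> copy = new ArrayList<>(fields);
    try {
      for (String f : copy) {
        if (f.startsWith("\"_source\"")) {
          copy.remove(f); // structural change while the iterator is still active
        }
      }
      return false;
    } catch (ConcurrentModificationException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    List<String> fields = Arrays.asList("\"_source\".a", "b", "c");
    System.out.println(removalInsideForEachThrows(fields));
    System.out.println(dropSourceFields(fields));
  }
}
```

Whether the for-each version throws depends on the position of the removed element (removing the second-to-last element ends the loop before the check runs), which is consistent with the issue title saying the exception "may be triggered".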


