The taxonomy seems to be about data stores (one or many) and supported query 
languages (one or many). Calcite is ‘many’ in each category, even though it 
appears to have one query language (SQL) and no data stores of its own. 

TL;DR: I would put it in the ‘polystore’ category. “Query answering” is one of 
its functions, but many projects use Calcite for query-planning (either 
embedded or as a service).

Calcite mainly supports SQL because it is the most important language in the 
industry. But the core of Calcite is the relational algebra and rewrite rules, 
and SQL is just one relational language. Calcite also has implementations of 
Pig Latin, LINQ, and externally Morel and (I believe) Datalog.
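
To make the "algebra and rewrite rules" point concrete, here is a toy sketch in Python. It is not Calcite's API: every class and function name below (Scan, Filter, Project, from_select, Pipeline, push_filter_below_project) is made up for illustration. Two different front ends build different operator trees for the same query, and a single rewrite rule brings them to the same plan.

```python
from dataclasses import dataclass


# Hypothetical relational operators (illustrative only, not Calcite classes).
@dataclass(frozen=True)
class Scan:
    table: str


@dataclass(frozen=True)
class Filter:
    input: object
    column: str
    value: object


@dataclass(frozen=True)
class Project:
    input: object
    columns: tuple


def from_select(columns, table, where):
    """SQL-style front end: SELECT columns FROM table WHERE col = val."""
    col, val = where
    return Project(Filter(Scan(table), col, val), tuple(columns))


class Pipeline:
    """Fluent, LINQ-style front end that builds the same kind of tree."""

    def __init__(self, node):
        self.node = node

    def select(self, *columns):
        return Pipeline(Project(self.node, tuple(columns)))

    def where(self, column, value):
        return Pipeline(Filter(self.node, column, value))


def push_filter_below_project(node):
    """Rewrite rule: Filter(Project(x)) -> Project(Filter(x)),
    applied only when the filtered column survives the projection."""
    if (isinstance(node, Filter)
            and isinstance(node.input, Project)
            and node.column in node.input.columns):
        proj = node.input
        return Project(Filter(proj.input, node.column, node.value),
                       proj.columns)
    return node


def execute(node, tables):
    """Naive single-process evaluator over in-memory lists of dicts."""
    if isinstance(node, Scan):
        return list(tables[node.table])
    if isinstance(node, Filter):
        return [r for r in execute(node.input, tables)
                if r[node.column] == node.value]
    if isinstance(node, Project):
        return [{c: r[c] for c in node.columns}
                for r in execute(node.input, tables)]
    raise TypeError(f"unknown operator: {node!r}")
```

For example, from_select(["name", "dept"], "emp", ("dept", 10)) and Pipeline(Scan("emp")).select("name", "dept").where("dept", 10).node start out as different trees, but applying push_filter_below_project to the second yields the first. The language is just a front end; the algebra and the rules are shared.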

Calcite’s adapters allow it to support many data stores. Again, SQL is the 
biggest language (via the JDBC adapter) but there are also adapters to CSV 
files, Elasticsearch, Druid, InnoDB, and Kafka.
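
The adapter idea can be sketched the same way. This is a toy in Python, not Calcite's adapter interface; CsvTable, MemoryTable and count_where are hypothetical names. Each "adapter" exposes the same scan() method, so the query code does not care where the rows come from.

```python
import csv
import io


class CsvTable:
    """Hypothetical adapter over CSV text with a header row."""

    def __init__(self, text):
        self.text = text

    def scan(self):
        # DictReader yields one dict per data row, keyed by header names.
        yield from csv.DictReader(io.StringIO(self.text))


class MemoryTable:
    """Hypothetical adapter over a list of dicts held in memory."""

    def __init__(self, rows):
        self.rows = rows

    def scan(self):
        # Normalize values to strings so both adapters return the same shape.
        for row in self.rows:
            yield {key: str(value) for key, value in row.items()}


def count_where(table, column, value):
    """Store-agnostic query: count rows where column equals value."""
    return sum(1 for row in table.scan() if row[column] == value)
```

count_where works unchanged against CsvTable("name,dept\nalice,10\nbob,20\n") and against MemoryTable([{"name": "carol", "dept": 10}]); only the adapter knows how the data is stored.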

The dimension missing from your diagram is the processing engine. Spark and 
Hadoop were initially distributed processing engines, and languages and data 
stores were added later. Like those systems, Calcite is agnostic about language 
and data store, but it does not have a strong processing engine - though its 
‘enumerable’ engine, with ~600 functions and processing in a single JVM, is 
good enough for many purposes.

Julian



> On Jan 29, 2024, at 1:27 AM, Teun Mathijssen <teun.c.mathijs...@gmail.com> 
> wrote:
> 
> Hi Francis,
> 
> Yes, my visualization can be found here: https://imgur.com/a/pawHWXf
> 
> Kind regards,
> 
> Teun Mathijssen
> 
> On 2024/01/29 07:52:02 Francis Chuang wrote:
>> Hello Teun,
>> 
>> It seems your attachment didn't come through. Can you upload it to Imgur
>> and link it here?
>> 
>> Thanks,
>> Francis
>> 
>> On 29/01/2024 6:45 pm, Teun Mathijssen wrote:
>>> Hi all,
>>> 
>>> I'm currently writing my Master's thesis in Software Engineering on
>>> Apache Calcite. For my background research I want to classify Calcite
>>> under a taxonomy formulated in "Enabling Query Processing across
>>> Heterogeneous Data Models: A Survey"
>>> (https://ieeexplore.ieee.org/abstract/document/8258302
>>> <https://ieeexplore.ieee.org/abstract/document/8258302>). The authors of
>>> this paper classify federated datastores from available literature under
>>> the data type and query interfaces they support. I made a visualization
>>> of their taxonomy which can be found in the attached file.
>>> 
>>> My question to you is: under which of these classes can we put Calcite?
>>> I think Calcite is a multistore system (with the sidenote that it does
>>> not manage its own data) as it provides one SQL interface and it's
>>> capable of handling heterogeneous data. What do you think? Can we even
>>> categorize it?
>>> 
>>> Looking forward to your answer.
>>> 
>>> Kind regards,
>>> 
>>> Teun Mathijssen
>> 
