The taxonomy seems to be about data stores (one or many) and supported query languages (one or many). Calcite is ‘many’ in each category, even though it appears to have one query language (SQL) and no data stores.
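
A minimal sketch of that two-axis reading, in Python. The `classify` helper is hypothetical and not part of Calcite. The 'multistore' and 'polystore' cells follow this thread; the 'federated' and 'polyglot' labels for the other two cells are an assumption based on the survey's terminology and should be checked against the paper.

```python
def classify(data_stores: str, query_languages: str) -> str:
    """Map the two taxonomy axes, each "one" or "many", to a class name."""
    table = {
        # Assumed labels for the two cells not discussed in this thread.
        ("one", "one"): "federated",
        ("one", "many"): "polyglot",
        # Labels used in this thread.
        ("many", "one"): "multistore",
        ("many", "many"): "polystore",
    }
    return table[(data_stores, query_languages)]


# Calcite read as 'many' on both axes.
print(classify("many", "many"))
# One SQL interface over heterogeneous data.
print(classify("many", "one"))
```

Under this reading, the classification depends only on whether SQL is counted as Calcite's single language or as one of several relational front ends.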

TL;DR: I would put it in the ‘polystore’ category.

“Query answering” is one of its functions, but many projects use Calcite for query-planning (either embedded or as a service).

Calcite mainly supports SQL because it is the most important language in the industry. But the core of Calcite is the relational algebra and rewrite rules, and SQL is just one relational language. Calcite also has implementations of Pig Latin, LINQ, and externally Morel and (I believe) Datalog.

Calcite’s adapters allow it to support data stores. Again, SQL is the biggest language (via the JDBC adapter) but there are also adapters to CSV files, Elasticsearch, Druid, InnoDB, Kafka.

The dimension missing from your diagram is the processing engine. Spark and Hadoop were initially distributed processing engines, and languages and data stores were added later. Like those systems, Calcite is agnostic about language and data store, but it does not have a strong processing engine - though its ‘enumerable’ engine, with ~600 functions and processing in a single JVM, is good enough for many purposes.

Julian

> On Jan 29, 2024, at 1:27 AM, Teun Mathijssen <[email protected]> wrote:
>
> Hi Francis,
>
> Yes, my visualization can be found here: https://imgur.com/a/pawHWXf
>
> Kind regards,
>
> Teun Mathijssen
>
> On 2024/01/29 07:52:02 Francis Chuang wrote:
>> Hello Teun,
>>
>> It seems your attachment didn't come through. Can you upload it to imgr
>> and link it here?
>>
>> Thanks,
>> Francis
>>
>> On 29/01/2024 6:45 pm, Teun Mathijssen wrote:
>>> Hi all,
>>>
>>> I'm currently writing my Master's thesis in Software Engineering on
>>> Apache Calcite. For my background research I want to classify Calcite
>>> under a taxonomy formulated in "Enabling Query Processing across
>>> Heterogeneous Data Models: A Survey"
>>> (https://ieeexplore.ieee.org/abstract/document/8258302).
>>> The authors of this paper classify federated datastores from
>>> available literature under the data type and query interfaces they
>>> support. I made a visualization of their taxonomy which can be found
>>> in the attached file.
>>>
>>> My question to you is: under which of these classes can we put Calcite?
>>> I think Calcite is a multistore system (with the sidenote that it does
>>> not manage its own data) as it provides one SQL interface and it's
>>> capable of handling heterogeneous data. What do you think? Can we even
>>> categorize it?
>>>
>>> Looking forward to your answer.
>>>
>>> Kind regards,
>>>
>>> Teun Mathijssen
>>
