Question on how to integrate Apache IoTDB into Calcite

Julian Feinauer Fri, 21 Jan 2022 04:21:52 -0800

Hi all,

over the last few weeks I have been working on integrating the Apache IoTDB 
project with Calcite.
This covers two possible scenarios. The first is to use Apache IoTDB as an 
adapter in Apache Calcite (like MongoDB, Cassandra, et al.). The second is to 
use Calcite's query optimizer to introduce indexing into the IoTDB server: the 
IoTDB server builds a RelNode tree and passes it to the planner; after 
planning, the resulting RelNode is processed further by the IoTDB server, 
executed, and the result is returned.


I have looked closely at the other adapters and how they are implemented, and 
have some questions:

One rather general question is about the Queryable<> interface. I tried to look 
up all the docs (also for LINQ) but still do not fully understand it. From my 
understanding it is like an Enumerable<>, but it has a “native” way to do 
things like ordering or filtering. So, as I understand it, if I have a 
Queryable<> which implements a custom filter, an automated “push down” can be 
done by the framework without a rule or code generation.

One important requirement for us in IoTDB is to push the query down into the 
TableScan (which is done implicitly in the current server but is initially 
explicit in the RelNode that we generate).
So what's the best way to “merge” a LogicalFilter and an IoTDBTableScan into a 
“filtered” scan?
Is the right way to return a QueryableTable as the TableScan, so that the 
planner takes care of it by generating the call to ‘.filter(…)’?
The same applies to ordering.
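
To make the intended rewrite concrete, here is a minimal sketch in plain Java. 
The Node, Scan, Filter and FilteredScan types are hypothetical stand-ins and 
not Calcite classes; the sketch only shows the shape of the transformation 
(a filter directly on top of a scan collapses into one filtered scan), not how 
Calcite would express it.

```java
public class FilterIntoScanSketch {
    /** Toy plan node; a stand-in for a RelNode, not a Calcite class. */
    interface Node {
        String describe();
    }

    /** Unfiltered scan of one table (hypothetical). */
    static final class Scan implements Node {
        final String table;
        Scan(String table) { this.table = table; }
        public String describe() { return "Scan(" + table + ")"; }
    }

    /** Filter on top of an arbitrary input (hypothetical). */
    static final class Filter implements Node {
        final Node input;
        final String condition;
        Filter(Node input, String condition) {
            this.input = input;
            this.condition = condition;
        }
        public String describe() {
            return "Filter(" + condition + ", " + input.describe() + ")";
        }
    }

    /** Scan that evaluates the condition itself (hypothetical). */
    static final class FilteredScan implements Node {
        final String table;
        final String condition;
        FilteredScan(String table, String condition) {
            this.table = table;
            this.condition = condition;
        }
        public String describe() {
            return "FilteredScan(" + table + ", " + condition + ")";
        }
    }

    /** Rewrites Filter(Scan) into FilteredScan; any other node is returned unchanged. */
    static Node pushFilterIntoScan(Node node) {
        if (node instanceof Filter) {
            Filter f = (Filter) node;
            if (f.input instanceof Scan) {
                Scan s = (Scan) f.input;
                return new FilteredScan(s.table, f.condition);
            }
        }
        return node;
    }

    public static void main(String[] args) {
        Node plan = new Filter(new Scan("root.sg1.d1"), "time > 100");
        // prints "FilteredScan(root.sg1.d1, time > 100)"
        System.out.println(pushFilterIntoScan(plan).describe());
    }
}
```

The table name and the condition string are made up for illustration; a real 
implementation would carry a row expression rather than a string.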

Another question that is important for us is the use of “materialized views” 
or other “indexes”.
As we almost always handle time series, in most cases the only suitable index 
is a “materialized view” on parts of the time series, which we can use to 
replace parts of the relational tree to avoid I/O and computation for parts 
that are already precomputed.

Is there already existing support for that in Calcite, or would we just write 
custom rules for our cases?
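
As an illustration of the substitution meant here, the following is a toy 
sketch in plain Java. All types and names (Node, Scan, Aggregate, the view name 
mv_hourly_avg) are hypothetical, and matching subtrees by their printed form is 
a deliberate simplification; it only shows a precomputed subtree being replaced 
by a scan of the stored result.

```java
import java.util.HashMap;
import java.util.Map;

public class ViewSubstitutionSketch {
    /** Toy plan node; a stand-in for a RelNode, not a Calcite class. */
    interface Node {
        String describe();
    }

    /** Scan of a table or of a stored view (hypothetical). */
    static final class Scan implements Node {
        final String table;
        Scan(String table) { this.table = table; }
        public String describe() { return "Scan(" + table + ")"; }
    }

    /** Aggregation over an input (hypothetical). */
    static final class Aggregate implements Node {
        final Node input;
        final String function;
        Aggregate(Node input, String function) {
            this.input = input;
            this.function = function;
        }
        public String describe() {
            return "Aggregate(" + function + ", " + input.describe() + ")";
        }
    }

    /**
     * Replaces every subtree whose printed form is registered in {@code views}
     * with a scan of the corresponding view; other nodes are rebuilt as-is.
     */
    static Node substitute(Node node, Map<String, String> views) {
        String view = views.get(node.describe());
        if (view != null) {
            return new Scan(view);
        }
        if (node instanceof Aggregate) {
            Aggregate a = (Aggregate) node;
            return new Aggregate(substitute(a.input, views), a.function);
        }
        return node;
    }

    public static void main(String[] args) {
        // The hourly average over root.sg1.d1 is assumed to be precomputed.
        Map<String, String> views = new HashMap<>();
        views.put("Aggregate(avg per hour, Scan(root.sg1.d1))", "mv_hourly_avg");

        Node hourly = new Aggregate(new Scan("root.sg1.d1"), "avg per hour");
        Node plan = new Aggregate(hourly, "max");
        // prints "Aggregate(max, Scan(mv_hourly_avg))"
        System.out.println(substitute(plan, views).describe());
    }
}
```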

My last question is about the Convention TraitDef. So far I have only used the 
Enumerable convention, which results in code generation (which has an impact on 
query latency). Am I right in assuming that the Bindable convention is similar 
to the Enumerable convention, with the only difference being that it uses 
interpretation instead of code generation?
And to potentially use both (depending on whatever switch we set), do we just 
have to provide converter rules for both?
What would you use in a server setup? Always Enumerable?

Thanks in advance for any responses or hints!
Julian F
