Hello,

>Has anyone done any benchmarks to evaluate Calcite’s performance impact?
>Or is there any documentation regarding performance concerns?

Well, the performance depends on your use case.
As far as I understand, here are the typical features:
1) Given a query, Calcite tries to push all the tables/predicates down to
the downstream executor (i.e. the DB).
2) In case there are joins between different data stores, Calcite still
pushes down as many filters as possible, yet performs the join in memory.
3) Calcite has no idea which indices are available at the storage level,
so don't expect it to generate plans like "for each row from
datastore1, go and fetch the relevant row from datastore2". In 100% of the
cases it would be "hashjoin(full fetch datastore1, full fetch datastore2)".
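
To make point 3 more concrete, here is a minimal sketch of what "hashjoin(full fetch datastore1, full fetch datastore2)" amounts to. It is plain Java and not Calcite's actual implementation. The class name, the two fetchAll* methods and the sample rows are all made up for illustration; they stand in for full scans of two separate data stores.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class HashJoinSketch {
    /** Hypothetical full scan of datastore1: rows of {customerId, name}. */
    static List<String[]> fetchAllCustomers() {
        List<String[]> rows = new ArrayList<>();
        rows.add(new String[] {"1", "alice"});
        rows.add(new String[] {"2", "bob"});
        rows.add(new String[] {"3", "carol"});
        return rows;
    }

    /** Hypothetical full scan of datastore2: rows of {orderId, customerId}. */
    static List<String[]> fetchAllOrders() {
        List<String[]> rows = new ArrayList<>();
        rows.add(new String[] {"10", "1"});
        rows.add(new String[] {"11", "3"});
        rows.add(new String[] {"12", "1"});
        return rows;
    }

    /** In-memory equi-join on left[leftKey] = right[rightKey]. */
    static List<String> hashJoin(List<String[]> left, int leftKey,
                                 List<String[]> right, int rightKey) {
        // Build phase: index every left row by its join key.
        Map<String, List<String[]>> built = new HashMap<>();
        for (String[] row : left) {
            built.computeIfAbsent(row[leftKey], k -> new ArrayList<>()).add(row);
        }
        // Probe phase: look up each right row in the hash table.
        List<String> out = new ArrayList<>();
        for (String[] row : right) {
            for (String[] match : built.getOrDefault(row[rightKey], List.of())) {
                out.add(String.join(",", match) + "|" + String.join(",", row));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Both sides are fetched completely before the join starts.
        for (String line : hashJoin(fetchAllCustomers(), 0, fetchAllOrders(), 1)) {
            System.out.println(line);
        }
    }
}
```

Note that both inputs are read in full no matter how selective the join is, so the cost is driven by the size of the two fetches rather than by the number of matching rows.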


In case you are going to use Calcite as a proxy (that is, Calcite would
just parse the query and send the whole thing downstream), then you might be
interested in the JMH-based benchmarks.
Here they are:
https://github.com/apache/calcite/blob/master/ubenchmark/src/main/java/org/apache/calcite/benchmarks/StatementTest.java
Feel free to add more benchmarks there.


PS. Index support is doable (one can fetch the sets of indices from the
downstream datastores); however, it is not done yet.

Vladimir
