You may want to keep an eye on CALCITE-2040
(https://issues.apache.org/jira/browse/CALCITE-2040). I have a student who
is working on a Calcite adapter for Apache Arrow. We're currently blocked
waiting for the Arrow team to release a compatible JAR. This still won't
fully solve your problem, though, as the first version of the adapter can
only read from Arrow files. The eventual goal, however, is to allow passing
a memory reference into the adapter, so that it can make use of Arrow data
constructed in memory elsewhere.
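
In the meantime, the first option from the original message (jaydebeapi
over the Calcite JDBC driver) can be sketched roughly as below. This is only
a sketch under assumptions, not taken from the notebooks: the helper names
calcite_jdbc_url and query_calcite are made up for illustration, and the
model file and JAR paths are placeholders you would have to supply. It
assumes the standard Calcite driver class and the "jdbc:calcite:model=..."
URL form.

```python
from pathlib import Path

# Driver class shipped with Calcite's JDBC driver.
CALCITE_DRIVER = "org.apache.calcite.jdbc.Driver"


def calcite_jdbc_url(model_path):
    """Build a Calcite JDBC URL that points at a model file (hypothetical helper)."""
    return "jdbc:calcite:model=" + Path(model_path).as_posix()


def query_calcite(sql, model_path, jars):
    """Run one query through jaydebeapi and return all rows (hypothetical helper).

    `jars` is the list of JAR paths that must be on the JVM classpath;
    which JARs are needed depends on your Calcite setup.
    """
    # Imported lazily so the URL helper works without jaydebeapi installed.
    import jaydebeapi

    conn = jaydebeapi.connect(CALCITE_DRIVER, calcite_jdbc_url(model_path), jars=jars)
    try:
        cursor = conn.cursor()
        try:
            cursor.execute(sql)
            return cursor.fetchall()
        finally:
            cursor.close()
    finally:
        conn.close()
```

Usage would look something like query_calcite("SELECT ...", "model.json",
["path/to/calcite-and-deps.jar"]), where both paths are placeholders. Rows
come back as plain Python tuples, which is the straightforward but slower
path compared with building Arrow objects on the JVM side.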
--
Michael Mior
mm...@apache.org


On Sun, Jan 30, 2022 at 17:36, Nicola Vitucci <nicola.vitu...@gmail.com> wrote:

> Hi all,
>
> What would be the best way to use Calcite with Python? I've come up with
> two potential solutions:
>
> - using the jaydebeapi package, to connect via the JDBC driver directly
> from a JVM created via jpype;
> - using Apache Arrow via the pyarrow package, to connect in basically the
> same way but creating Arrow objects with JdbcToArrowUtils (and optionally
> converting them to Pandas).
>
> Although the former is more straightforward, the latter can achieve
> better performance (see [1] for instance) since it's exactly what Arrow is
> for. I've created two Jupyter notebooks [2] showing each solution. What
> would you recommend? Is there an even better approach?
>
> Thanks,
>
> Nicola
>
> [1] https://uwekorn.com/2020/12/30/fast-jdbc-revisited.html
> [2] https://github.com/nvitucci/calcite-sparql/tree/v0.0.2/examples/python
>
