Re: Using Calcite with Python
Thanks, Jacques. I looked at dask-sql a few days ago, but it uses Calcite (via jpype) only for query planning. I'll follow your work on GraalVM with interest.

Nicola

On Tue, Feb 1, 2022 at 00:58, Jacques Nadeau wrote:
> [...]
Re: Using Calcite with Python
I have nothing of value to add, but:

> [5] https://github.com/oap-project/gazelle-jni/tree/velox_dev

Hot damn this is neat

On Mon, Jan 31, 2022 at 7:58 PM Jacques Nadeau wrote:
> [...]
Re: Using Calcite with Python
A couple of related (possibly useful?) pointers here:

- Dask-sql [1] uses Calcite in a Python context. Might be some good stuff to leverage there.
- I'm working on compiling Calcite as a GraalVM shared native library [2] as part of Substrait [3], with the goal of ultimately having a friendly C binding [4] for use in non-JVM worlds. This connects to work being done by others to support tools like Arrow and Velox [5] as Substrait targets (thus completing the path from a C interface to native execution via Calcite).

[1] https://github.com/dask-contrib/dask-sql
[2] https://issues.apache.org/jira/browse/CALCITE-4786
[3] https://github.com/substrait-io/substrait/pull/120
[4] https://github.com/jacques-n/substrait/pull/3
[5] https://github.com/oap-project/gazelle-jni/tree/velox_dev

On Mon, Jan 31, 2022 at 3:32 PM Nicola Vitucci wrote:
> [...]
Re: Using Calcite with Python
Hi Eugen, Michael, Gavin,

Thank you very much for your input. In response to your suggestions:

- Phoenix client: I saw it but decided not to use it because it does not seem very active or up to date (its Avatica version is 1.10, while the latest is 1.20). I may still give it a try, though.
- Arrow Flight: I think it could be very useful, especially, as Michael mentioned, if it were integrated with Avatica as a transport; at the moment, though, it is not.

I am basically looking for a solution that is (relatively) easy and ready to implement, easy to keep up to date, and reasonably performant. Although it incurs some overhead, a solution based on Python + Java seems to me the most reasonable for the time being. Do you have any other suggestions or recommendations?

Thanks again,

Nicola

On Mon, Jan 31, 2022 at 17:04, Michael Mior wrote:
> [...]
Re: Using Calcite with Python
Flight is definitely another consideration for the future. Personally, I think it would be most interesting to integrate Flight with Avatica as an alternative transport. But it would certainly also be useful to allow the Arrow adapter to connect to any Flight endpoint.

--
Michael Mior
mm...@apache.org

On Mon, Jan 31, 2022 at 10:00, Gavin Ray wrote:
> [...]
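To make "connect to any Flight endpoint" a little more concrete, here is a minimal sketch of a plain Arrow Flight round trip from Python with pyarrow. It is an assumption-laden illustration, not Flight SQL proper: Flight SQL wraps queries in its own protobuf command messages, whereas this sketch assumes a hypothetical server that accepts a raw SQL string as the command. The `fetch_tables` name, the example location, and the injectable `flight` parameter are made up for illustration.

```python
def fetch_tables(location, sql, flight=None):
    """Send `sql` as a Flight command; return one Arrow table per endpoint.

    `flight` defaults to the pyarrow.flight module. It is injectable so
    the control flow can be exercised without a running server.
    """
    if flight is None:
        import pyarrow.flight as flight  # third-party dependency

    client = flight.FlightClient(location)
    # Assumption: the server interprets the raw command bytes as SQL.
    descriptor = flight.FlightDescriptor.for_command(sql.encode("utf-8"))
    # Step 1: ask the server where the result of this command can be read.
    info = client.get_flight_info(descriptor)
    # Step 2: redeem each endpoint's ticket and read the whole stream.
    return [client.do_get(endpoint.ticket).read_all()
            for endpoint in info.endpoints]
```

Against a real server this would be called as `fetch_tables("grpc://localhost:32010", "SELECT 1")`, where the host and port are placeholders. Each returned table can be converted with `to_pandas()` if needed.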
Re: Using Calcite with Python
This is really interesting stuff you've done in the example notebooks.

Nicola & Michael, I wonder if you could benefit from the recently released Arrow Flight SQL?

https://www.dremio.com/subsurface/arrow-flight-and-arrow-flight-sql-accelerating-data-movement/

I have asked Jacques about this a bit -- it's meant to be a standardization for communicating SQL queries and metadata with Arrow. I'm not intimately familiar with it, but it seems like it could be a good base to build a Calcite backend for Arrow from?

They have a pretty thorough Java example in the repository:

https://github.com/apache/arrow/blob/968e6ea488c939c0e1f2bfe339a5a9ed1aed603e/java/flight/flight-sql/src/test/java/org/apache/arrow/flight/sql/example/FlightSqlExample.java#L169-L180

On Mon, Jan 31, 2022 at 8:47 AM Michael Mior wrote:
> [...]
Re: Using Calcite with Python
You may want to keep an eye on CALCITE-2040 (https://issues.apache.org/jira/browse/CALCITE-2040). I have a student who is working on a Calcite adapter for Apache Arrow. We're basically hung up waiting on the Arrow team to release a compatible JAR. This still won't fully solve your problem, though, as the first version of the adapter is only capable of reading from Arrow files. However, the goal is eventually to allow passing a memory reference into the adapter so that it would be possible to make use of Arrow data which is constructed in-memory elsewhere.

--
Michael Mior
mm...@apache.org

On Sun, Jan 30, 2022 at 17:36, Nicola Vitucci wrote:
> [...]
Re: Using Calcite with Python
Hi Nicola,

It's a question I was asking myself the other day. I don't know the answer, but I do have a direction to explore: the Avatica client.

There is a nice description and diagram here: https://calcite.apache.org/avatica/docs/

There is also a list of clients further down that page. See https://calcite.apache.org/avatica/docs/#apache-phoenix-database-adapter-for-python

Please let me know how it goes and what you find out.

On 31.01.2022 00:35, Nicola Vitucci wrote:
> [...]

Regards,
--
Eugen Stan
+40770 941 271 / https://www.netdava.com
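For anyone wanting to try the Avatica route, the Python adapter linked above is published as the phoenixdb package, which speaks the Avatica protocol over HTTP and exposes a DB-API style interface. The following is only a sketch under assumptions: it presumes an Avatica HTTP server is already running in front of Calcite, that the adapter's protocol version is compatible with it (not guaranteed), and the URL, the `fetch_rows` name, and the injectable `connect` parameter are illustrative.

```python
def fetch_rows(url, sql, connect=None):
    """Run `sql` against an Avatica HTTP endpoint and return all rows.

    `connect` defaults to phoenixdb.connect. It is injectable so the
    function can be exercised without a running server.
    """
    if connect is None:
        import phoenixdb  # third-party Avatica client for Python

        connect = phoenixdb.connect

    conn = connect(url, autocommit=True)
    try:
        cursor = conn.cursor()
        cursor.execute(sql)
        return cursor.fetchall()
    finally:
        # Always release the server-side connection, even on errors.
        conn.close()
```

A call would look like `fetch_rows("http://localhost:8765/", "SELECT 1")`, with the host and port being placeholders for wherever the Avatica server listens.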
Using Calcite with Python
Hi all,

What would be the best way to use Calcite with Python? I've come up with two potential solutions:

- using the jaydebeapi package, to connect via the JDBC driver directly from a JVM created via jpype;
- using Apache Arrow via the pyarrow package, to connect in basically the same way but creating Arrow objects with JdbcToArrowUtils (and optionally converting them to Pandas).

Although the former is more straightforward, the latter achieves better performance (see [1] for instance), since that is exactly what Arrow is for. I've created two Jupyter notebooks [2] showing each solution. What would you recommend? Is there an even better approach?

Thanks,

Nicola

[1] https://uwekorn.com/2020/12/30/fast-jdbc-revisited.html
[2] https://github.com/nvitucci/calcite-sparql/tree/v0.0.2/examples/python
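The first option (jaydebeapi on top of a jpype-managed JVM) might look roughly like the sketch below. It is not taken from the notebooks: the `calcite_url` and `run_query` helpers, the injectable `connect` parameter, and all paths are illustrative assumptions. It assumes the Calcite JARs are available locally and that a Calcite model file describes the schema. The driver class `org.apache.calcite.jdbc.Driver` and the `jdbc:calcite:model=...` URL form are Calcite's usual JDBC entry points.

```python
CALCITE_DRIVER = "org.apache.calcite.jdbc.Driver"


def calcite_url(model_path):
    """Build a Calcite JDBC URL that points at a model file."""
    return f"jdbc:calcite:model={model_path}"


def run_query(sql, model_path, jars, connect=None):
    """Run `sql` through the Calcite JDBC driver and return all rows.

    `connect` defaults to jaydebeapi.connect. It is injectable so the
    function can be exercised without starting a JVM.
    """
    if connect is None:
        import jaydebeapi  # third-party; starts the JVM through jpype

        connect = jaydebeapi.connect

    conn = connect(CALCITE_DRIVER, calcite_url(model_path), jars=jars)
    try:
        cursor = conn.cursor()
        try:
            cursor.execute(sql)
            return cursor.fetchall()
        finally:
            cursor.close()
    finally:
        # Close the JDBC connection even if the query fails.
        conn.close()
```

With a real setup this would be called as `run_query("SELECT ...", "/path/to/model.json", ["/path/to/calcite-core.jar"])`, where the JAR list must include Calcite and its dependencies. Rows come back as Python tuples converted one value at a time, which is the per-row overhead the Arrow-based option tries to avoid.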