I have nothing of value to add, but:

> [5] https://github.com/oap-project/gazelle-jni/tree/velox_dev

Hot damn this is neat

On Mon, Jan 31, 2022 at 7:58 PM Jacques Nadeau <jacq...@apache.org> wrote:

> A couple of related (possibly useful?) pointers here:
>
> - Dask-sql [1] uses Calcite in a Python context. Might be some good
>   stuff to leverage there.
> - I'm working on compiling Calcite as a GraalVM shared native library
>   [2] as part of Substrait [3] with the goal of ultimately having a
>   friendly C binding [4] for use in non-JVM worlds. This connects to
>   work being done by others to support tools like Arrow and Velox [5]
>   as Substrait targets (and thus completing the path from C interface
>   to native execution via Calcite).
>
> [1] https://github.com/dask-contrib/dask-sql
> [2] https://issues.apache.org/jira/browse/CALCITE-4786
> [3] https://github.com/substrait-io/substrait/pull/120
> [4] https://github.com/jacques-n/substrait/pull/3
> [5] https://github.com/oap-project/gazelle-jni/tree/velox_dev
>
> On Mon, Jan 31, 2022 at 3:32 PM Nicola Vitucci <nicola.vitu...@gmail.com> wrote:
>
> > Hi Eugen, Michael, Gavin,
> >
> > Thank you very much for your input. In response to your suggestions:
> >
> > - Phoenix client: I saw it but decided not to use it because it does
> >   not seem very active or up to date (its Avatica version is 1.10,
> >   while the latest is 1.20). I may still give it a try though.
> > - Arrow Flight: I think it can be very useful, especially, as Michael
> >   mentioned, if it were integrated with Avatica as a transport; at
> >   the moment, though, it is not.
> >
> > I am basically looking for a solution that is (relatively) easy and
> > ready to implement, easy to keep up to date, and reasonably
> > performant. Although it incurs some overhead, a solution based on
> > Python + Java seems to me the most reasonable for the time being. Do
> > you have any other suggestions or recommendations?
> >
> > Thanks again,
> >
> > Nicola
> >
> > On Mon, Jan 31, 2022 at 17:04, Michael Mior <mm...@apache.org> wrote:
> >
> > > Flight is definitely another consideration for the future.
> > > Personally I think it would be most interesting to integrate Flight
> > > with Avatica as an alternative transport. But it would certainly
> > > also be useful to allow the Arrow adapter to connect to any Flight
> > > endpoint.
> > >
> > > --
> > > Michael Mior
> > > mm...@apache.org
> > >
> > > On Mon, Jan 31, 2022 at 10:00, Gavin Ray <ray.gavi...@gmail.com> wrote:
> > >
> > > > This is really interesting stuff you've done in the example
> > > > notebooks.
> > > >
> > > > Nicola & Michael, I wonder if you could benefit from the
> > > > recently released Arrow Flight SQL?
> > > >
> > > > https://www.dremio.com/subsurface/arrow-flight-and-arrow-flight-sql-accelerating-data-movement/
> > > >
> > > > I have asked Jacques about this a bit -- it's meant to be a
> > > > standardization for communicating SQL queries and metadata with
> > > > Arrow. I'm not intimately familiar with it, but it seems like it
> > > > could be a good base to build a Calcite backend for Arrow from?
> > > >
> > > > They have a pretty thorough Java example in the repository:
> > > >
> > > > https://github.com/apache/arrow/blob/968e6ea488c939c0e1f2bfe339a5a9ed1aed603e/java/flight/flight-sql/src/test/java/org/apache/arrow/flight/sql/example/FlightSqlExample.java#L169-L180
> > > >
> > > > On Mon, Jan 31, 2022 at 8:47 AM Michael Mior <mm...@apache.org> wrote:
> > > >
> > > > > You may want to keep an eye on CALCITE-2040
> > > > > (https://issues.apache.org/jira/browse/CALCITE-2040). I have a
> > > > > student who is working on a Calcite adapter for Apache Arrow.
> > > > > We're basically hung up waiting on the Arrow team to release a
> > > > > compatible JAR.
> > > > > This still won't fully solve your problem though, as the first
> > > > > version of the adapter is only capable of reading from Arrow
> > > > > files. However, the goal is eventually to allow passing a
> > > > > memory reference into the adapter so that it would be possible
> > > > > to make use of Arrow data which is constructed in-memory
> > > > > elsewhere.
> > > > >
> > > > > --
> > > > > Michael Mior
> > > > > mm...@apache.org
> > > > >
> > > > > On Sun, Jan 30, 2022 at 17:36, Nicola Vitucci <nicola.vitu...@gmail.com> wrote:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > What would be the best way to use Calcite with Python? I've
> > > > > > come up with two potential solutions:
> > > > > >
> > > > > > - using the jaydebeapi package, to connect via the JDBC
> > > > > >   driver directly from a JVM created via jpype;
> > > > > > - using Apache Arrow via the pyarrow package, to connect in
> > > > > >   basically the same way but creating Arrow objects with
> > > > > >   JdbcToArrowUtils (and optionally converting them to
> > > > > >   Pandas).
> > > > > >
> > > > > > Although the former is more straightforward, the latter
> > > > > > achieves better performance (see [1] for instance), since
> > > > > > it's exactly what Arrow is for. I've created two Jupyter
> > > > > > notebooks [2] showing each solution. What would you
> > > > > > recommend? Is there an even better approach?
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Nicola
> > > > > >
> > > > > > [1] https://uwekorn.com/2020/12/30/fast-jdbc-revisited.html
> > > > > > [2] https://github.com/nvitucci/calcite-sparql/tree/v0.0.2/examples/python
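
For anyone reading along, here is a rough sketch of the first option from the original question (jaydebeapi on top of the Calcite JDBC driver). It is not taken from Nicola's notebooks: the function names, the model file path, and the jar list are placeholders I made up for illustration, and it assumes the Calcite jar(s) with their dependencies are available locally. The jaydebeapi import is kept inside the function so the URL helper can be used without a JVM installed.

```python
# Driver class shipped with Calcite's JDBC driver.
CALCITE_DRIVER = "org.apache.calcite.jdbc.Driver"


def calcite_jdbc_url(model_path):
    """Build a Calcite JDBC URL pointing at a JSON model file."""
    return f"jdbc:calcite:model={model_path}"


def query_calcite(sql, model_path, jars):
    """Run one query through the Calcite JDBC driver and return all rows.

    `model_path` and `jars` are placeholders (assumptions): the model file
    describing the schema, and the Calcite jar(s) plus dependencies.
    """
    # Third-party package; it starts a JVM through jpype when connecting.
    import jaydebeapi

    conn = jaydebeapi.connect(CALCITE_DRIVER, calcite_jdbc_url(model_path), jars=jars)
    try:
        cursor = conn.cursor()
        try:
            cursor.execute(sql)
            return cursor.fetchall()
        finally:
            cursor.close()
    finally:
        conn.close()
```

Usage would look something like query_calcite("SELECT 1", "model.json", ["calcite-all.jar"]), where both file names are hypothetical. The Arrow-based option follows the same connection pattern but hands the JDBC result set to JdbcToArrowUtils on the Java side instead of fetching rows through the cursor.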