Re: Using Calcite with Python

Michael Mior Mon, 31 Jan 2022 09:04:06 -0800

Flight is definitely another consideration for the future. Personally I
think it would be most interesting to integrate Flight with Avatica as an
alternative transport. But it would certainly also be useful to allow the
Arrow adapter to connect to any Flight endpoint.


--
Michael Mior
mm...@apache.org


On Mon, Jan 31, 2022 at 10:00, Gavin Ray <ray.gavi...@gmail.com> wrote:

> This is really interesting stuff you've done in the example notebooks
>
> Nicola & Michael, I wonder if you could benefit from the recently-released
> Arrow Flight SQL?
>
> https://www.dremio.com/subsurface/arrow-flight-and-arrow-flight-sql-accelerating-data-movement/
>
> I have asked Jacques about this a bit -- it's meant to be a standardization
> for communicating SQL queries and metadata with Arrow.
> I'm not intimately familiar with it, but it seems like it could be a good
> base to build a Calcite backend for Arrow from?
>
> They have a pretty thorough Java example in the repository:
>
> https://github.com/apache/arrow/blob/968e6ea488c939c0e1f2bfe339a5a9ed1aed603e/java/flight/flight-sql/src/test/java/org/apache/arrow/flight/sql/example/FlightSqlExample.java#L169-L180
>
> On Mon, Jan 31, 2022 at 8:47 AM Michael Mior <mm...@apache.org> wrote:
>
> > You may want to keep an eye on CALCITE-2040 (
> > https://issues.apache.org/jira/browse/CALCITE-2040). I have a student who
> > is working on a Calcite adapter for Apache Arrow. We're basically hung up
> > waiting on the Arrow team to release a compatible JAR. This still won't
> > fully solve your problem though, as the first version of the adapter is only
> > capable of reading from Arrow files. However, the goal is eventually to
> > allow passing a memory reference into the adapter so that it would be
> > possible to make use of Arrow data which is constructed in-memory
> > elsewhere.
> > --
> > Michael Mior
> > mm...@apache.org
> >
> >
> > On Sun, Jan 30, 2022 at 17:36, Nicola Vitucci <nicola.vitu...@gmail.com> wrote:
> >
> > > Hi all,
> > >
> > > What would be the best way to use Calcite with Python? I've come up with
> > > two potential solutions:
> > >
> > > - using the jaydebeapi package, to connect via the JDBC driver directly
> > > from a JVM created via jpype;
> > > - using Apache Arrow via the pyarrow package, to connect in basically the
> > > same way but creating Arrow objects with JdbcToArrowUtils (and optionally
> > > converting them to Pandas).
> > >
> > > Although the former is more straightforward, the latter achieves better
> > > performance (see [1] for instance), since that is exactly what Arrow is
> > > for. I've created two Jupyter notebooks [2], one for each solution. What
> > > would you recommend? Is there an even better approach?
> > >
> > > Thanks,
> > >
> > > Nicola
> > >
> > > [1] https://uwekorn.com/2020/12/30/fast-jdbc-revisited.html
> > > [2] https://github.com/nvitucci/calcite-sparql/tree/v0.0.2/examples/python
> > >
> >
>
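
For readers following the thread, the first option in Nicola's original question (plain JDBC through jaydebeapi) could look roughly like the sketch below. It is not taken from the linked notebooks. The model file, the jar path, and the query are placeholders you would supply yourself, and the helper names `calcite_jdbc_url` and `run_query` are made up for illustration. It assumes jaydebeapi is installed and that the jar passed in contains the Calcite JDBC driver and its dependencies.

```python
# Sketch of the jaydebeapi option: query Calcite through its JDBC driver.
# Assumptions (not from the thread): model path, jar path and SQL text are
# supplied by the caller; the helper names here are hypothetical.

CALCITE_DRIVER = "org.apache.calcite.jdbc.Driver"  # Calcite's JDBC driver class


def calcite_jdbc_url(model_path):
    """Build a Calcite JDBC URL that points at a JSON model file."""
    return "jdbc:calcite:model=" + model_path


def run_query(model_path, jar_path, sql):
    """Run one query through the Calcite JDBC driver and return all rows."""
    # Imported lazily so this module loads even without jaydebeapi or a JVM.
    import jaydebeapi

    conn = jaydebeapi.connect(
        CALCITE_DRIVER,
        calcite_jdbc_url(model_path),
        jars=jar_path,  # jar(s) that provide the driver class
    )
    try:
        cursor = conn.cursor()
        try:
            cursor.execute(sql)
            return cursor.fetchall()  # list of row tuples (DB-API style)
        finally:
            cursor.close()
    finally:
        conn.close()
```

Calling `run_query("model.json", "calcite-all.jar", "SELECT ...")` with real paths would start a JVM via jpype on first use. The Arrow route from the second bullet replaces the `fetchall()` step with a columnar transfer, which is where the speed-up described in [1] comes from.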
