Re: Apache Arrow adapter

2021-04-10 Thread Michael Mior
Yes, it was a bit of a challenge to get working in the Linux and macOS
development environments we've been using. This is why I temporarily
checked in the jar, but this should certainly be removed before the PR
is merged.

--
Michael Mior
mm...@apache.org

On Sat, Apr 10, 2021 at 5:05 PM Julian Hyde wrote:
>
> I've been trying to switch over to use the official Apache Arrow
> Gandiva 3.0.0 jar at Maven Central. (Which means we can remove the
> 3.0.0-SNAPSHOT.jar that you had checked into arrow/libs.) That jar is
> built for macOS, and is a little more tricky to get running than the
> previous jar, which was built for Linux. I'll post to
> https://issues.apache.org/jira/browse/ARROW-11135 as I discover
> things.
>
> (Makes me glad we don't have any C++ code in Calcite. Making artifacts
> that work on multiple operating systems seems to be really
> challenging.)
>
> Julian
>


Re: Apache Arrow adapter

2021-04-10 Thread Julian Hyde
I've been trying to switch over to use the official Apache Arrow
Gandiva 3.0.0 jar at Maven Central. (Which means we can remove the
3.0.0-SNAPSHOT.jar that you had checked into arrow/libs.) That jar is
built for macOS, and is a little more tricky to get running than the
previous jar, which was built for Linux. I'll post to
https://issues.apache.org/jira/browse/ARROW-11135 as I discover
things.

(Makes me glad we don't have any C++ code in Calcite. Making artifacts
that work on multiple operating systems seems to be really
challenging.)

Julian

On Sat, Apr 10, 2021 at 6:31 AM Michael Mior wrote:
>
> Thanks Julian! I really appreciate the help. I think beta would be
> accurate here but it would be great to have this pushed so people can
> start trying it out.
>
> --
> Michael Mior
> mm...@apache.org
>


Re: Apache Arrow adapter

2021-04-10 Thread Michael Mior
Thanks Julian! I really appreciate the help. I think beta would be
accurate here but it would be great to have this pushed so people can
start trying it out.

--
Michael Mior
mm...@apache.org

On Fri, Apr 9, 2021 at 8:37 PM Julian Hyde wrote:
>
> Yes, thanks to Michael and Karshit for their great work.
>
> I am reviewing now, and doing some fix up (e.g. lint, repositories) so
> that we could get it into master as a "beta" component. I'll add
> updates in https://issues.apache.org/jira/browse/CALCITE-2040.
>


Re: Apache Arrow adapter

2021-04-09 Thread Julian Hyde
Yes, thanks to Michael and Karshit for their great work.

I am reviewing now, and doing some fix up (e.g. lint, repositories) so
that we could get it into master as a "beta" component. I'll add
updates in https://issues.apache.org/jira/browse/CALCITE-2040.

On Wed, Apr 7, 2021 at 9:37 PM Fan Liya wrote:
>
> Hi Michael,
>
> Thanks for sharing the great work.
> I believe it is important work for both communities.
>
> Best,
> Liya Fan
>
>


Re: Apache Arrow adapter

2021-04-07 Thread Fan Liya
Hi Michael,

Thanks for sharing the great work.
I believe it is important work for both communities.

Best,
Liya Fan


On Thu, Apr 8, 2021 at 3:30 AM Michael Mior wrote:

> Hi all,
>
> I wanted to share some work one of my (now former) students, Karshit
> Shah, has done with integrating Apache Arrow into Calcite. Karshit has
> written an Arrow adapter that's able to perform filtering and
> projections natively on Arrow data using Gandiva so these expressions
> can be JITed using LLVM. The pull request[0] needs some cleanup, but
> the code is in relatively good shape.
>
> Right now, the adapter only reads from files, but I think there are a
> number of exciting extensions to this that are possible. For example,
> Arrow has a client-server framework Flight which could be connected
> with Calcite, perhaps via Avatica. (Andy Grove was doing some work on
> this last year[1] although I'm not sure of the progress.)
>
> The biggest blocker on this is actually not the Calcite code, but the
> availability of a suitably built Arrow dependency with Gandiva along
> with the appropriate CI configuration. I opened a JIRA on the Arrow
> project with some more details[2].
>
> I'd love some thoughts on the approach and some help in pushing this
> over the finish line.
>
> [0] https://github.com/apache/calcite/pull/2133
> [1]
> https://mail-archives.apache.org/mod_mbox/calcite-dev/202002.mbox/%3cCAJEf=X5xvXLQpJkX_VjJk=TnNRwT52v0=p28sczmid1tyce...@mail.gmail.com%3e
> [2] https://issues.apache.org/jira/browse/ARROW-11135
> --
> Michael Mior
> mm...@apache.org
>