Hi,
I think JIT compilation of kernels operating on Arrow data is an important
development path, but just for the record, LLVM doesn't have a stable C++
API (the API changes at each feature release). Just something to keep in
mind for the ensuing packaging discussions ;-)

(It also raises interesting questions such as "what happens if a user wants
to use both PyArrow and Numba in a given process, and they don't target the
same LLVM API version?")

Regards

Antoine.


On 22/06/2018 at 01:26, Wes McKinney wrote:
> hi Jacques,
>
> This is very exciting! LLVM codegen for Arrow has been on my wishlist
> since the early days of the project. I always considered it more of a
> "when" question than an "if".
>
> I will take a closer look at the codebase to make some comments, but
> my biggest initial question is whether we could work to make Gandiva
> the official community-supported LLVM framework for creating
> JIT-compiled Arrow kernels. In the Ursa Labs (a new lab I am building
> to focus 90+% on Apache Arrow development) tech roadmap we discussed
> the need for a subgraph compiler using LLVM:
> https://ursalabs.org/tech/#subgraph-compilation-code-generation.
>
> I would be interested in getting involved in the project, and I
> expect in time many others will, as well. An obvious question would be
> whether you would be interested in donating the project to Apache
> Arrow and continuing the work there. We would benefit from common
> build, testing/CI, and packaging/deployment infrastructure. I'm keen
> to see JIT-powered predicate pushdown in Parquet files, for example.
> Phillip and I could look into building a Gandiva backend for compiling
> a subset of expressions originating from Ibis, a lazy-evaluation DSL
> system with similar API to pandas
> (https://github.com/ibis-project/ibis).
>
> best
> Wes
>
> On Thu, Jun 21, 2018 at 4:13 PM, Dimitri Vorona
> <alen...@googlemail.com.invalid> wrote:
>> Hey Jacques,
>>
>> Great stuff! I'm actually researching the integration of Arrow and Flight
>> into a main memory database which also uses LLVM for dynamic query
>> generation! Excited to have a more detailed look at Gandiva!
>>
>> Cheers,
>> Dimitri.
>>
>> On Thu, Jun 21, 2018, 21:15 Jacques Nadeau <jacq...@apache.org> wrote:
>>
>>> Hey Guys,
>>>
>>> Dremio just open sourced a new framework for processing data in Arrow data
>>> structures [1], built on top of the Apache Arrow C++ APIs and leveraging
>>> LLVM (Apache licensed). It also includes Java APIs that leverage the Apache
>>> Arrow Java libraries. I expect the developers who have been working on this
>>> will introduce themselves soon. To read more about it, take a look at
>>> Ravindra's blog post (he's the lead developer driving this work): [2].
>>> Hopefully people will find this interesting/useful.
>>>
>>> Let us know what you all think!
>>>
>>> thanks,
>>> Jacques
>>>
>>>
>>> [1] https://github.com/dremio/gandiva
>>> [2] https://www.dremio.com/announcing-gandiva-initiative-for-apache-arrow/
>>>