I'm working with Steve on this issue. Can you please share what you have in 
mind for something more general than Gandiva's serialized expressions?

I'm currently working through a design. I imagine we will have a FlatBuffers 
schema defining all expression types, with the different C++ expression 
classes (e.g. ComparisonExpression) acting as thin wrappers around the 
generated FlatBuffers types.
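
To make the wrapper idea concrete, here is a rough sketch of what I have in 
mind. Nothing here is real generated code: the fbs_sketch namespace, the 
Comparison struct and its fields are hypothetical stand-ins for whatever 
flatc would emit from the eventual schema, and the literal is limited to 
int64 for brevity.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-ins for the types flatc would generate from an
// expression schema. Real generated types would be read-only views over
// a serialized buffer rather than plain structs.
namespace fbs_sketch {
enum class CompareOp : int8_t { Equal, NotEqual, Less, Greater };

struct Comparison {
  CompareOp op;
  std::string field_name;  // left operand: a column reference
  int64_t literal;         // right operand: an int64 literal, for brevity
};
}  // namespace fbs_sketch

// The C++ expression class keeps no state of its own. It only exposes
// accessors (and evaluation) over the serialized representation.
class ComparisonExpression {
 public:
  explicit ComparisonExpression(
      std::shared_ptr<const fbs_sketch::Comparison> data)
      : data_(std::move(data)) {
    assert(data_ != nullptr);
  }

  fbs_sketch::CompareOp op() const { return data_->op; }
  const std::string& field_name() const { return data_->field_name; }
  int64_t literal() const { return data_->literal; }

  // Evaluate the comparison against a single column value.
  bool Evaluate(int64_t value) const {
    switch (data_->op) {
      case fbs_sketch::CompareOp::Equal:
        return value == data_->literal;
      case fbs_sketch::CompareOp::NotEqual:
        return value != data_->literal;
      case fbs_sketch::CompareOp::Less:
        return value < data_->literal;
      case fbs_sketch::CompareOp::Greater:
        return value > data_->literal;
    }
    return false;  // unreachable for valid enum values
  }

 private:
  std::shared_ptr<const fbs_sketch::Comparison> data_;
};
```

The point of this shape is that the serialized form stays the single source 
of truth, so another language binding could read the same buffer without 
going through the C++ classes.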

I also noticed that the data types used in filters are not backed by 
format/Expression.fbs and instead use the types defined in cpp/arrow/type.h. 
I think it would be good to move to Expression.fbs so that the data types 
themselves are also language independent. I'd appreciate any feedback or 
thoughts.

On 2020/07/06 21:44:40, Wes McKinney <wesmck...@gmail.com> wrote: 
> I would also be interested in having a reusable serialized format for
> filter- and projection-like expressions. I think trying to go so far
> as full logical query plans suitable for building a SQL engine is
> perhaps a bit too far but we could start small with the use case from
> the JNI Datasets PR as a motivating example. We should also consider
> replacing or deprecating Gandiva's serialized expressions in favor of
> something more general.
> 
> It may be a slight bikeshed issue, but I wouldn't be thrilled about
> having this be based on Protocol Buffers, because of the runtime
> requirement (on libprotobuf.so / libprotobuf.a) it introduces into C++
> applications. Flatbuffers might be less pleasant developer UX in Java
> but at least in C++ the fact that Flatbuffers results in zero build-
> or runtime dependencies is a significant advantage.
