I'm working with Steve on this issue. Can you please share what you have in mind for something more general than Gandiva's serialized expressions?
I'm currently working through a design. I imagine we will have a FlatBuffers schema defining all expression types, with the different C++ expression classes (e.g. ComparisonExpression) acting as wrappers around the generated flatbuf structs.

I also noticed that the data types used in filters are not backed by format/Expression.fbs and instead use the types defined in cpp/arrow/type.h. I'm thinking it would be good to move to Expression.fbs so that the data types themselves are also language independent. I'd appreciate any feedback or thoughts.

On 2020/07/06 21:44:40, Wes McKinney <wesmck...@gmail.com> wrote:
> I would also be interested in having a reusable serialized format for
> filter- and projection-like expressions. I think trying to go so far
> as full logical query plans suitable for building a SQL engine is
> perhaps a bit too far but we could start small with the use case from
> the JNI Datasets PR as a motivating example. We should also consider
> replacing or deprecating Gandiva's serialized expressions in favor of
> something more general.
>
> It may be a slight bikeshed issue, but I wouldn't be thrilled about
> having this be based on Protocol Buffers, because of the runtime
> requirement (on libprotobuf.so / libprotobuf.a) it introduces into C++
> applications. Flatbuffers might be less pleasant developer UX in Java
> but at least in C++ the fact that Flatbuffers results in zero build-
> or runtime dependencies is a significant advantage.
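
To make the wrapper idea from my design notes a bit more concrete, here is a rough sketch of the shape I have in mind. Everything in it is an assumption for illustration: the fbs_sketch namespace, the Comparison struct and the CompareOp enum are hand-written stand-ins for whatever flatc would generate from a real schema, and this ComparisonExpression is not the existing Arrow class. It only handles a single int64 literal.

```cpp
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical stand-in for a flatc-generated struct. A real version
// would come from a schema file, not be written by hand like this.
namespace fbs_sketch {
enum class CompareOp : int8_t { Equal, NotEqual, Less, Greater };

struct Comparison {
  CompareOp op;            // which comparison to apply
  std::string field_name;  // column the comparison refers to
  int64_t literal;         // right-hand side (int64 only in this sketch)
};
}  // namespace fbs_sketch

// Illustrative C++ wrapper: owns the serialized form and exposes a
// friendlier API on top of it. Not the existing Arrow class.
class ComparisonExpression {
 public:
  explicit ComparisonExpression(fbs_sketch::Comparison data)
      : data_(std::move(data)) {}

  const std::string& field_name() const { return data_.field_name; }

  // Applies the stored comparison to one value of the referenced field.
  bool Evaluate(int64_t value) const {
    switch (data_.op) {
      case fbs_sketch::CompareOp::Equal:    return value == data_.literal;
      case fbs_sketch::CompareOp::NotEqual: return value != data_.literal;
      case fbs_sketch::CompareOp::Less:     return value < data_.literal;
      case fbs_sketch::CompareOp::Greater:  return value > data_.literal;
    }
    return false;  // unreachable for valid enum values
  }

 private:
  fbs_sketch::Comparison data_;
};
```

The point is that the serialized struct stays the single source of truth, so any language with a generated binding could read the same expression, while the C++ class only adds convenience methods on top.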