I only looked briefly into the Gandiva protobuf, but one issue seems to be
the use of protobuf itself (Wes is against this for dependency reasons). There
are also some inconsistencies between the Gandiva protobuf and how filter
expressions should be represented, e.g. in the Gandiva protobuf fields are
typed, whereas I think a field should just contain a field name.
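
To make the typed-versus-named distinction concrete, here is a minimal Python
sketch. Every name in it (TypedField, NamedField, resolve, the string type
labels) is hypothetical and chosen for illustration only; none of it is the
Gandiva protobuf or an Arrow API. It assumes a schema is simply a mapping from
field name to a type label.

```python
from dataclasses import dataclass


# Hypothetical node types for illustration; not Gandiva's or Arrow's API.
@dataclass(frozen=True)
class TypedField:
    """Field reference that carries its own type (the typed style)."""
    name: str
    type: str


@dataclass(frozen=True)
class NamedField:
    """Field reference that carries only a field name."""
    name: str


def resolve(field: NamedField, schema: dict) -> TypedField:
    """Bind a name-only reference to a type using a schema mapping."""
    if field.name not in schema:
        raise KeyError(f"unknown field: {field.name}")
    return TypedField(field.name, schema[field.name])


# One name-only reference can be bound against schemas whose types differ.
ref = NamedField("price")
bound_a = resolve(ref, {"price": "int32"})
bound_b = resolve(ref, {"price": "float64"})
```

With name-only fields the type is looked up when the expression is bound to a
schema, so the serialized expression does not have to repeat or agree with
type information that the schema already holds.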

-----Original Message-----
From: Jacques Nadeau <jacq...@apache.org>
Sent: Thursday, July 23, 2020 10:14 PM
To: dev <dev@arrow.apache.org>
Subject: [ext] Re: language independent representation of filter expressions

Have you tried to use the existing expression representation provided by 
Gandiva? What are the issues you've seen with it?

On Wed, Jul 22, 2020 at 10:24 AM Patrick Pai <p...@drwholdings.com> wrote:

> Hi all,
>
> After some discussion with Steve, we'd like to propose and get
> feedback on an alternative to representing expressions entirely with 
> flatbuffers.
>
> To give some context, we thought about how we'd construct flatbuffer
> expressions in Java or another language if we went down that route. We
> realized that it'd be possible, but not user friendly. An example is
> specifying an array of int values in Java for an InExpression. In
> Java, we'd ideally have some user-friendly class (e.g. Arrow's
> IntVector) that then gets converted to the appropriate flatbuffer
> representation. I think this is what Jacques was saying about language
> support being too weak - it's possible for Java users to construct a
> flatbuffer expression, but not easily without an additional conversion layer 
> for every language.
>
> An alternative we're thinking about is to only represent enum values (i.e.
> those defined in arrow::dataset::ExpressionType::type) in a flatbuffer
> schema, and rely on the existing IPC format (used to
> serialize/deserialize cpp expressions) to pass the struct array
> representation of an expression from, for example, Java to C++. The one
> difference is in the struct array representation, we use the enum
> values defined in our flatbuffer schema instead of existing cpp enums.
> This approach requires us on the Java side (and languages other than
> C++) to construct the struct array, but the benefit is minimal changes
> to the C++ code (the main change being using our flatbuffer schema enum 
> values).
>
>
> On 2020/07/13 09:21:19, Antoine Pitrou <solip...@pitrou.net> wrote:
> > On Sat, 11 Jul 2020 09:55:16 -0700
> > Jacques Nadeau <jacq...@apache.org> wrote:
> > >
> > > > I'm against extending use of flatbuf within Arrow. The language
> > > > support is too weak. Language support isn't just about having a
> > > > binding for different languages, it is about having a high-quality
> > > > binding.
> >
> > Could you please expand on this?  ("the language support is too
> > weak")
> >
> > Thank you
> >
> > Antoine.
> >
> >
> >
>
> This e-mail and any attachments may contain information that is
> confidential and proprietary and otherwise protected from disclosure.
> If you are not the intended recipient of this e-mail, do not read,
> duplicate or redistribute it by any means. Please immediately delete
> it and any attachments and notify the sender that you have received it by 
> mistake.
> Unintended recipients are prohibited from taking action on the basis
> of information in this e-mail or any attachments. The DRW Companies
> make no representations that this e-mail or any attachments are free
> of computer viruses or other defects.
>
