While it's unfortunate to have to re-examine some basic design issues
at this stage, I agree with Jacques's point that it would be nice if
we could accommodate (without great hardship) the use case where a
stream/pipeline of record batches is passed in C without requiring
the called function to parse or validate the schema each time.
Gandiva uses its own data structure [1] for passing a schemaless
record batch across JNI, and in theory this could be replaced by the
C data structure.

[1]: https://github.com/apache/arrow/blob/master/cpp/src/gandiva/eval_batch.h
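
To make the pipelined case concrete, here is a rough sketch of what I
have in mind. Every name in it (ExampleSchema, ExampleBatch,
ExampleConsumer and the two functions) is hypothetical and is not
taken from the spec PR or from Gandiva; the "i" format string meaning
int32 is likewise just an assumption for this example. The point is
only that the schema is checked once up front, and each batch handed
over afterwards carries nothing but a length and a buffer pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical types for illustration only; NOT the structs from the
   proposed specification. */

/* Schema description, passed once per stream. */
struct ExampleSchema {
  const char* format; /* assumed convention: "i" means int32 */
  const char* name;
};

/* Schemaless batch: only the data, no type information. */
struct ExampleBatch {
  int64_t length;
  const int32_t* values;
};

/* Consumer state that survives across batches. */
struct ExampleConsumer {
  int schema_ok;
  int64_t total;
};

/* Validate the schema once, before any batch arrives.
   Returns 0 on success, -1 if the schema is missing or unsupported. */
int example_consumer_init(struct ExampleConsumer* c,
                          const struct ExampleSchema* schema) {
  c->schema_ok = 0;
  c->total = 0;
  if (schema == NULL || schema->format == NULL) return -1;
  if (strcmp(schema->format, "i") != 0) return -1;
  c->schema_ok = 1;
  return 0;
}

/* Hot path: called once per batch, does no schema parsing at all. */
void example_consumer_push(struct ExampleConsumer* c,
                           const struct ExampleBatch* batch) {
  assert(c->schema_ok); /* init must have succeeded beforehand */
  for (int64_t i = 0; i < batch->length; ++i) {
    c->total += batch->values[i];
  }
}
```

A caller would run example_consumer_init once per stream and then
example_consumer_push for every batch, so the per-batch cost is
limited to touching the data itself.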

On Sun, Dec 8, 2019 at 8:09 PM Fan Liya <liya.fa...@gmail.com> wrote:
>
> +1, as this is useful IMO.
>
> Best,
> Liya Fan
>
> On Sat, Dec 7, 2019 at 12:21 PM Jacques Nadeau <jacq...@apache.org> wrote:
>
> > -1 (binding)
> >
> > I'm voting -1 on this. I posted the thinking why on the PR. The high-level
> > is that I think it needs to better address the pipelined use case as right
> > now it fails to support that at all and has too much weight to ignore that
> > use case.
> >
> > I actually would have posted it here but totally missed this vote thread
> > until just now (I'm traveling atm). My -1 is not an indefinite -1, I'm
> > simply asking for some small changes to the approach to also support the
> > pipelined usage pattern.
> >
> > On Sat, Dec 7, 2019 at 3:09 AM Wes McKinney <wesmck...@gmail.com> wrote:
> >
> > > Hello,
> > >
> > > Could more PMC members take a look at this work?
> > >
> > > Thank you
> > >
> > > On Tue, Dec 3, 2019 at 1:50 PM Neal Richardson
> > > <neal.p.richard...@gmail.com> wrote:
> > > >
> > > > +1 (non-binding)
> > > >
> > > > On Tue, Dec 3, 2019 at 10:56 AM Wes McKinney <wesmck...@gmail.com> wrote:
> > > >
> > > > > +1 (binding)
> > > > >
> > > > > On Tue, Dec 3, 2019 at 12:54 PM Wes McKinney <wesmck...@gmail.com> wrote:
> > > > > >
> > > > > > hello,
> > > > > >
> > > > > > > We have been discussing the creation of a minimalist C-based data
> > > > > > > interface for applications to exchange Arrow columnar data structures
> > > > > > > with each other. Some notable features of this interface include:
> > > > > > >
> > > > > > > * A small amount of header-only C code can be copied into downstream
> > > > > > > applications, no external dependencies are needed (notable, it is not
> > > > > > > required to use Flatbuffers, though there are trade-offs resulting
> > > > > > > from this)
> > > > > > > * Low development investment (in other words: limited-scope use cases
> > > > > > > can be accomplished with little code). Enable C libraries to export
> > > > > > > Arrow columnar data at C call sites with minimal code
> > > > > >
> > > > > > > This "C Data Interface" serves different use cases from the
> > > > > > > language-independent IPC protocol and trades away a number of features
> > > > > > > (such as forward/backward compatibility) in the interest of minimalism
> > > > > > > / simplicity. It is not a replacement for the IPC protocol and will
> > > > > > > only be used to interchange in-process data at C call sites.
> > > > > >
> > > > > > The PR providing the specification is here
> > > > > >
> > > > > > https://github.com/apache/arrow/pull/5442
> > > > > >
> > > > > > A fairly comprehensive C++ implementation of this demonstrating its
> > > > > > use is found here
> > > > > >
> > > > > > https://github.com/apache/arrow/pull/5608
> > > > > >
> > > > > > > (note that other applications implementing the interface may choose to
> > > > > > > only support a few features and thus have far less code to write)
> > > > > >
> > > > > > Please vote to adopt the SPECIFICATION (GitHub PR #5442).
> > > > > >
> > > > > > This vote will be open for at least 72 hours
> > > > > >
> > > > > > [ ] +1 Adopt C Data Interface specification
> > > > > > [ ] +0
> > > > > > [ ] -1 Do not adopt because...
> > > > > >
> > > > > > Thank you
> > > > >
> > >
> >
