Re: [VOTE] Adopt Arrow in-process C Data Interface specification

Antoine Pitrou Mon, 09 Dec 2019 04:21:11 -0800


Right, I'll give it a try in a few days.


Best regards

Antoine.


On 09/12/2019 at 12:46, Wes McKinney wrote:
> While it's unfortunate to have to re-examine some basic design issues
> at this stage, I agree with Jacques's point that it would be nice if
> we could accommodate (without great hardship) the use case where a
> stream/pipeline of record batches is passed in C without requiring
> the called function to parse or validate the schema each time.
> Gandiva uses its own data structure [1] for passing a schemaless
> record batch across JNI, and in theory this could be replaced by the
> C data structure.
> 
> [1]: https://github.com/apache/arrow/blob/master/cpp/src/gandiva/eval_batch.h
> 
> On Sun, Dec 8, 2019 at 8:09 PM Fan Liya <liya.fa...@gmail.com> wrote:
>>
>> +1, as this is useful IMO.
>>
>> Best,
>> Liya Fan
>>
>> On Sat, Dec 7, 2019 at 12:21 PM Jacques Nadeau <jacq...@apache.org> wrote:
>>
>>> -1 (binding)
>>>
>>> I'm voting -1 on this. I posted my reasoning on the PR. The high-level
>>> summary is that I think it needs to better address the pipelined use
>>> case: right now it fails to support that at all, and it carries too
>>> much weight to ignore that use case.
>>>
>>> I actually would have posted it here but totally missed this vote thread
>>> until just now (I'm traveling atm). My -1 is not an indefinite -1, I'm
>>> simply asking for some small changes to the approach to also support the
>>> pipelined usage pattern.
>>>
>>> On Sat, Dec 7, 2019 at 3:09 AM Wes McKinney <wesmck...@gmail.com> wrote:
>>>
>>>> Hello,
>>>>
>>>> Could more PMC members take a look at this work?
>>>>
>>>> Thank you
>>>>
>>>> On Tue, Dec 3, 2019 at 1:50 PM Neal Richardson
>>>> <neal.p.richard...@gmail.com> wrote:
>>>>>
>>>>> +1 (non-binding)
>>>>>
>>>>> On Tue, Dec 3, 2019 at 10:56 AM Wes McKinney <wesmck...@gmail.com> wrote:
>>>>>
>>>>>> +1 (binding)
>>>>>>
>>>>>> On Tue, Dec 3, 2019 at 12:54 PM Wes McKinney <wesmck...@gmail.com> wrote:
>>>>>>>
>>>>>>> hello,
>>>>>>>
>>>>>>> We have been discussing the creation of a minimalist C-based data
>>>>>>> interface for applications to exchange Arrow columnar data structures
>>>>>>> with each other. Some notable features of this interface include:
>>>>>>>
>>>>>>> * A small amount of header-only C code can be copied into downstream
>>>>>>> applications; no external dependencies are needed (notably, it is not
>>>>>>> required to use Flatbuffers, though there are trade-offs resulting
>>>>>>> from this)
>>>>>>> * Low development investment (in other words: limited-scope use cases
>>>>>>> can be accomplished with little code). C libraries can export
>>>>>>> Arrow columnar data at C call sites with minimal code
>>>>>>>
>>>>>>> This "C Data Interface" serves different use cases from the
>>>>>>> language-independent IPC protocol and trades away a number of features
>>>>>>> (such as forward/backward compatibility) in the interest of minimalism
>>>>>>> / simplicity. It is not a replacement for the IPC protocol and will
>>>>>>> only be used to interchange in-process data at C call sites.
>>>>>>>
>>>>>>> The PR providing the specification is here
>>>>>>>
>>>>>>> https://github.com/apache/arrow/pull/5442
>>>>>>>
>>>>>>> A fairly comprehensive C++ implementation of this demonstrating its
>>>>>>> use is found here
>>>>>>>
>>>>>>> https://github.com/apache/arrow/pull/5608
>>>>>>>
>>>>>>> (note that other applications implementing the interface may choose to
>>>>>>> only support a few features and thus have far less code to write)
>>>>>>>
>>>>>>> Please vote to adopt the SPECIFICATION (GitHub PR #5442).
>>>>>>>
>>>>>>> This vote will be open for at least 72 hours
>>>>>>>
>>>>>>> [ ] +1 Adopt C Data Interface specification
>>>>>>> [ ] +0
>>>>>>> [ ] -1 Do not adopt because...
>>>>>>>
>>>>>>> Thank you
>>>>>>
>>>>
>>>
