Re: [DISCUSS][C++] Raw pointer string views

2023-09-26 Thread Gang Wu
Could you please briefly describe the string layouts of DuckDB and Velox so we know what kind of conversion is required from the raw pointer variant? If any engine simply represents a string array as something like std::vector, should we provide a similar variant in C++ to minimize the

Re: [VOTE] Release Apace Arrow nanoarrow 0.3.0 - RC0

2023-09-26 Thread Jacob Wujciak-Jens
+1 (non-binding) full verification with conda arrow 13.0.0 R 4.3 on pop_os 23.04, cmake 3.27, gcc 11
On Wed, Sep 27, 2023 at 1:26 AM Bryce Mecum wrote:
> +1 (non-binding)
>
> Verified with `./verify-release-candidate.sh 0.3.0 0` on:
> - Windows 10, x86_64, libarrow-main, MSVC 17 2022, R 4.3.1,

Re: [DISCUSS][Gandiva] External function registry proposal

2023-09-26 Thread Yue Ni
> I think the key idea is to let users call Gandiva functions to register functions and pass necessary info explicitly to Gandiva, rather than letting Gandiva discover them by itself. That makes sense. Thanks Jin and Antoine for your valuable feedback. I will revise the proposal accordingly

[DISCUSS][Flight SQL] Adding Ingest Support for Flight SQL

2023-09-26 Thread Joel Lubi
Hi devs, I would like to open a discussion around adding support for a native "ingest" command to the Flight SQL specification. The initial motivating use-case for this is to be able to support ADBC ingest when using the Flight SQL driver, which is currently not possible because the specific

Re: [VOTE] Release Apace Arrow nanoarrow 0.3.0 - RC0

2023-09-26 Thread Bryce Mecum
+1 (non-binding)
Verified with `./verify-release-candidate.sh 0.3.0 0` on:
- Windows 10, x86_64, libarrow-main, MSVC 17 2022, R 4.3.1, Rtools 43
- macOS 13.6, aarch64, libarrow 13.0.0, R 4.3.1
- Ubuntu 23.04, aarch64, libarrow 13.0.0, R 4.2.2

Re: [DISCUSS][C++] Raw pointer string views

2023-09-26 Thread Raphael Taylor-Davies
I'm confused why this would need to copy string data. Assuming the pointers point into defined memory regions (something the C data interface's ownership semantics require regardless), why can't these memory regions just be used as buffers as-is? This would therefore require just rewriting
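A minimal sketch of the idea raised in this message, rewriting each 16-byte view in place rather than copying string data, might look like the following. It assumes 64-bit pointers, the index/offset layout from the Arrow columnar format (length, 4-byte prefix, buffer index, offset), and a raw pointer layout of length, 4-byte prefix, 8-byte pointer. All type and function names (`View`, `Buffer`, `MakeRawPointerView`, `RewriteToIndexOffset`, `ReadIndexOffsetView`) are hypothetical illustrations and not Arrow C++ API.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string_view>
#include <vector>

// Hypothetical sketch; none of these names come from Arrow C++.
static_assert(sizeof(const char*) == 8, "sketch assumes 64-bit pointers");

using View = std::array<unsigned char, 16>;  // one 16-byte string view
constexpr int32_t kInlineSize = 12;          // strings up to 12 bytes live inside the view

struct Buffer {  // a memory region holding string bytes
  const char* data;
  int64_t size;
};

int32_t ViewLength(const View& v) {
  int32_t n;
  std::memcpy(&n, v.data(), sizeof(n));
  return n;
}

// Raw pointer layout (assumed): length | 4-byte prefix | 8-byte pointer.
// Short strings are inlined, identically to the index/offset layout.
View MakeRawPointerView(const char* s, int32_t len) {
  View v{};
  std::memcpy(v.data(), &len, sizeof(len));
  if (len <= kInlineSize) {
    std::memcpy(v.data() + 4, s, static_cast<std::size_t>(len));
  } else {
    std::memcpy(v.data() + 4, s, 4);
    std::memcpy(v.data() + 8, &s, sizeof(s));
  }
  return v;
}

// Rewrites a raw pointer view in place to: length | prefix | buffer index | offset.
// Only the 16-byte view changes; the string bytes are never copied.
// Returns false if the pointed-to range is not inside any known buffer.
bool RewriteToIndexOffset(View& v, const std::vector<Buffer>& buffers) {
  const int32_t len = ViewLength(v);
  if (len <= kInlineSize) return true;  // inline views need no rewrite
  const char* p;
  std::memcpy(&p, v.data() + 8, sizeof(p));
  const auto addr = reinterpret_cast<std::uintptr_t>(p);
  for (std::size_t i = 0; i < buffers.size(); ++i) {  // linear scan, fine for a sketch
    const auto begin = reinterpret_cast<std::uintptr_t>(buffers[i].data);
    const auto end = begin + static_cast<std::uintptr_t>(buffers[i].size);
    if (addr >= begin && addr + static_cast<std::uintptr_t>(len) <= end) {
      const auto index = static_cast<int32_t>(i);
      const auto offset = static_cast<int32_t>(addr - begin);  // assumes buffers < 2 GiB
      std::memcpy(v.data() + 8, &index, sizeof(index));
      std::memcpy(v.data() + 12, &offset, sizeof(offset));
      return true;
    }
  }
  return false;
}

// Resolves an index/offset view back to its bytes.
std::string_view ReadIndexOffsetView(const View& v, const std::vector<Buffer>& buffers) {
  const int32_t len = ViewLength(v);
  if (len <= kInlineSize) {
    return {reinterpret_cast<const char*>(v.data()) + 4, static_cast<std::size_t>(len)};
  }
  int32_t index, offset;
  std::memcpy(&index, v.data() + 8, sizeof(index));
  std::memcpy(&offset, v.data() + 12, sizeof(offset));
  return {buffers[static_cast<std::size_t>(index)].data + offset,
          static_cast<std::size_t>(len)};
}
```

Under these assumptions the per-view cost is a 16-byte rewrite plus a buffer lookup, and the sketch fails (returns false) for any pointer that does not land inside a registered memory region.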

Re: [DISCUSS][C++] Raw pointer string views

2023-09-26 Thread Matt Topol
I believe the motivation is to avoid the cost of the data copy that would be needed to convert from a pointer-based to an offset-based representation. Allowing the pointer-based implementation will ensure that we can maintain zero-copy communication with both DuckDB and Velox in a common workflow

Re: [VOTE] Release Apace Arrow nanoarrow 0.3.0 - RC0

2023-09-26 Thread David Li
+1
Tested on Ubuntu 20.04 LTS/x86_64, R 4.3.1
On Tue, Sep 26, 2023, at 18:05, Dane Pitkin wrote:
> +1 (non-binding)
>
> I verified successfully on MacOS 13.5 (aarch64) with:
>
> cd dev/release && ./verify-release-candidate.sh 0.3.0 0
>
> On Tue, Sep 26, 2023 at 5:30 PM Sutou Kouhei wrote:
>

Re: [DISCUSS][C++] Raw pointer string views

2023-09-26 Thread Raphael Taylor-Davies
Hi, Is the motivation here to avoid DuckDB and Velox having to duplicate the conversion logic from pointer-based to offset-based, or to allow arrow-cpp to operate directly on pointer-based arrays? If it is the former, I personally wouldn't have thought the conversion logic sufficiently

Re: [VOTE] Release Apace Arrow nanoarrow 0.3.0 - RC0

2023-09-26 Thread Dane Pitkin
+1 (non-binding)
I verified successfully on MacOS 13.5 (aarch64) with:
cd dev/release && ./verify-release-candidate.sh 0.3.0 0
On Tue, Sep 26, 2023 at 5:30 PM Sutou Kouhei wrote:
> +1
>
> I ran the following command line on Debian GNU/Linux sid:
>
> CMAKE_PREFIX_PATH=/tmp/local \
>

Re: [VOTE] Release Apace Arrow nanoarrow 0.3.0 - RC0

2023-09-26 Thread Sutou Kouhei
+1
I ran the following command line on Debian GNU/Linux sid:
  CMAKE_PREFIX_PATH=/tmp/local \
  dev/release/verify-release-candidate.sh 0.3.0 0
with:
* Apache Arrow C++ main
* gcc (Debian 13.2.0-4) 13.2.0
* R version 4.3.1 (2023-06-16) -- "Beagle Scouts"
Thanks,
-- kou
In "[VOTE]

[DISCUSS][C++] Raw pointer string views

2023-09-26 Thread Benjamin Kietzman
Hello all, In the PR to add support for Utf8View to the C++ implementation, I've taken the approach of allowing raw pointer views [1] alongside the index/offset views described in the spec [2]. This was done to ease communication with other engines such as DuckDB and Velox whose native string

Re: [DISCUSS][Gandiva] External function registry proposal

2023-09-26 Thread Jin Shang
I agree with Antoine that we don't need to define a JSON format or a directory structure for Gandiva. To support external functions, we essentially need two things: 1. Gandiva's function registry needs to be aware of the function metadata: We can achieve this by having a

Re: [VOTE][RUST] Release Apache Arrow Rust Object Store 0.7.1 RC1

2023-09-26 Thread Andrew Lamb
+1 (binding)
Verified on mac x86_64
Looks like a good release to me -- thank you Raphael
Andrew
On Tue, Sep 26, 2023 at 12:05 PM Raphael Taylor-Davies wrote:
> Hi,
>
> I would like to propose a release of Apache Arrow Rust Object
> Store Implementation, version 0.7.1.
>
> This release

Re: [VOTE][RUST] Release Apache Arrow Rust Object Store 0.7.1 RC1

2023-09-26 Thread L. C. Hsieh
+1 (binding)
Verified on M1 Mac.
Thanks Raphael.
On Tue, Sep 26, 2023 at 9:05 AM Raphael Taylor-Davies wrote:
>
> Hi,
>
> I would like to propose a release of Apache Arrow Rust Object
> Store Implementation, version 0.7.1.
>
> This release candidate is based on commit:
>

Re: [Format] C Data Interface integration testing

2023-09-26 Thread Dewey Dunnington
Thank you for setting this up! I look forward to adding nanoarrow as soon as time allows.
Cheers,
-dewey
On Tue, Sep 26, 2023 at 9:48 AM Antoine Pitrou wrote:
>
> Hello,
>
> We have added some infrastructure for integration testing of the C Data
> Interface between Arrow implementations. We

[VOTE][RUST] Release Apache Arrow Rust Object Store 0.7.1 RC1

2023-09-26 Thread Raphael Taylor-Davies
Hi, I would like to propose a release of Apache Arrow Rust Object Store Implementation, version 0.7.1. This release candidate is based on commit: 4ef7917bd57b701e30def8511b5fd8a7961f2fcf [1] The proposed release tarball and signatures are hosted at [2]. The changelog is located at [3].

[VOTE] Release Apace Arrow nanoarrow 0.3.0 - RC0

2023-09-26 Thread Dewey Dunnington
Hello, I would like to propose the following release candidate (rc0) of Apache Arrow nanoarrow [0] version 0.3.0. This is an initial release consisting of 42 resolved GitHub issues from 4 contributors [1]. This release candidate is based on commit: c00cd7707bcddb4dab9a7d19bf63e87c06d36c63 [2]

[Format] C Data Interface integration testing

2023-09-26 Thread Antoine Pitrou
Hello, We have added some infrastructure for integration testing of the C Data Interface between Arrow implementations. We are now testing the C++ and Go implementations, but the goal in the future is for all major implementations to be tested there (perhaps including nanoarrow). - PR to