Re: [VOTE][RUST] Release Apache Arrow Rust 11.0.0 RC1

2022-03-18 Thread Yijie Shen
+1 non-binding

Thanks,
Yijie

On Sat, Mar 19, 2022 at 10:23 AM LM  wrote:

> +1 (non-binding)
>
> Verified on macOS 12.3 on M1Max
>
> Thanks,
> Lin
>
> On Fri, Mar 18, 2022 at 5:52 PM QP Hou  wrote:
>
> > +1 (binding)
> > Thanks,
> > QP Hou
> >
> > On Fri, Mar 18, 2022 at 1:01 AM Andrew Lamb wrote:
> > >
> > > Hi,
> > >
> > > I would like to propose a release of Apache Arrow Rust Implementation,
> > > version 11.0.0.
> > >
> > > This release candidate is based on commit:
> > > 5d6b638111e3f9c72dc8504ea98e46914fc93af5 [1]
> > >
> > > The proposed release tarball and signatures are hosted at [2].
> > >
> > > The changelog is located at [3].
> > >
> > > Please download, verify checksums and signatures, run the unit tests,
> > > and vote on the release. There is a script [4] that automates some of
> > > the verification.
> > >
> > > The vote will be open for at least 72 hours.
> > >
> > > [ ] +1 Release this as Apache Arrow Rust
> > > [ ] +0
> > > [ ] -1 Do not release this as Apache Arrow Rust  because...
> > >
> > > [1]:
> > > https://github.com/apache/arrow-rs/tree/5d6b638111e3f9c72dc8504ea98e46914fc93af5
> > > [2]:
> > > https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-rs-11.0.0-rc1
> > > [3]:
> > > https://github.com/apache/arrow-rs/blob/5d6b638111e3f9c72dc8504ea98e46914fc93af5/CHANGELOG.md
> > > [4]:
> > > https://github.com/apache/arrow-rs/blob/master/dev/release/verify-release-candidate.sh
> > > -
> >
>


Re: [VOTE][RUST] Release Apache Arrow Rust 11.0.0 RC1

2022-03-18 Thread LM
+1 (non-binding)

Verified on macOS 12.3 on M1Max

Thanks,
Lin

On Fri, Mar 18, 2022 at 5:52 PM QP Hou  wrote:

> +1 (binding)
> Thanks,
> QP Hou
>
> On Fri, Mar 18, 2022 at 1:01 AM Andrew Lamb  wrote:
> >
> > Hi,
> >
> > I would like to propose a release of Apache Arrow Rust Implementation,
> > version 11.0.0.
> >
> > This release candidate is based on commit:
> > 5d6b638111e3f9c72dc8504ea98e46914fc93af5 [1]
> >
> > The proposed release tarball and signatures are hosted at [2].
> >
> > The changelog is located at [3].
> >
> > Please download, verify checksums and signatures, run the unit tests,
> > and vote on the release. There is a script [4] that automates some of
> > the verification.
> >
> > The vote will be open for at least 72 hours.
> >
> > [ ] +1 Release this as Apache Arrow Rust
> > [ ] +0
> > [ ] -1 Do not release this as Apache Arrow Rust  because...
> >
> > [1]:
> > https://github.com/apache/arrow-rs/tree/5d6b638111e3f9c72dc8504ea98e46914fc93af5
> > [2]:
> > https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-rs-11.0.0-rc1
> > [3]:
> > https://github.com/apache/arrow-rs/blob/5d6b638111e3f9c72dc8504ea98e46914fc93af5/CHANGELOG.md
> > [4]:
> > https://github.com/apache/arrow-rs/blob/master/dev/release/verify-release-candidate.sh
> > -
>


Re: [VOTE][RUST] Release Apache Arrow Rust 11.0.0 RC1

2022-03-18 Thread QP Hou
+1 (binding)
Thanks,
QP Hou

On Fri, Mar 18, 2022 at 1:01 AM Andrew Lamb  wrote:
>
> Hi,
>
> I would like to propose a release of Apache Arrow Rust Implementation,
> version 11.0.0.
>
> This release candidate is based on commit:
> 5d6b638111e3f9c72dc8504ea98e46914fc93af5 [1]
>
> The proposed release tarball and signatures are hosted at [2].
>
> The changelog is located at [3].
>
> Please download, verify checksums and signatures, run the unit tests,
> and vote on the release. There is a script [4] that automates some of
> the verification.
>
> The vote will be open for at least 72 hours.
>
> [ ] +1 Release this as Apache Arrow Rust
> [ ] +0
> [ ] -1 Do not release this as Apache Arrow Rust  because...
>
> [1]:
> https://github.com/apache/arrow-rs/tree/5d6b638111e3f9c72dc8504ea98e46914fc93af5
> [2]: https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-rs-11.0.0-rc1
> [3]:
> https://github.com/apache/arrow-rs/blob/5d6b638111e3f9c72dc8504ea98e46914fc93af5/CHANGELOG.md
> [4]:
> https://github.com/apache/arrow-rs/blob/master/dev/release/verify-release-candidate.sh
> -


RE: [VOTE][RUST] Release Apache Arrow Rust 11.0.0 RC1

2022-03-18 Thread Matthew Turner
+1 [non-binding]
Verified on M1 Mac.

Thanks, Andrew.

-----Original Message-----
From: Andy Grove  
Sent: Friday, March 18, 2022 12:47 PM
To: dev 
Subject: Re: [VOTE][RUST] Release Apache Arrow Rust 11.0.0 RC1

+1 (binding)

Verified on Ubuntu 20.04.3 LTS

On Fri, Mar 18, 2022 at 2:01 AM Andrew Lamb  wrote:

> Hi,
>
> I would like to propose a release of Apache Arrow Rust Implementation, 
> version 11.0.0.
>
> This release candidate is based on commit:
> 5d6b638111e3f9c72dc8504ea98e46914fc93af5 [1]
>
> The proposed release tarball and signatures are hosted at [2].
>
> The changelog is located at [3].
>
> Please download, verify checksums and signatures, run the unit tests, 
> and vote on the release. There is a script [4] that automates some of 
> the verification.
>
> The vote will be open for at least 72 hours.
>
> [ ] +1 Release this as Apache Arrow Rust
> [ ] +0
> [ ] -1 Do not release this as Apache Arrow Rust  because...
>
> [1]:
> https://github.com/apache/arrow-rs/tree/5d6b638111e3f9c72dc8504ea98e46914fc93af5
> [2]: https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-rs-11.0.0-rc1
> [3]:
> https://github.com/apache/arrow-rs/blob/5d6b638111e3f9c72dc8504ea98e46914fc93af5/CHANGELOG.md
> [4]:
> https://github.com/apache/arrow-rs/blob/master/dev/release/verify-release-candidate.sh
> -
>


Re: Arrow in HPC

2022-03-18 Thread David Li
For anyone interested, the PR is finally up and ready: 
https://github.com/apache/arrow/pull/12442

As part of this, Flight in C++ was refactored to allow plugging in alternative 
transports. There's more work to be done there (auth, middleware, etc. need to 
be uplifted into the common layer), but this should enable UCX and potentially 
other network transports.

There are still some caveats as described in the PR itself, including some edge 
cases I need to track down and missing support for a variety of features, but 
the core data plane methods are supported and the Flight benchmark can be run.

Thanks to Yibo Cai, Pavel Shamis, Antoine Pitrou (among others) for assistance 
and review, and the HPC Advisory Council for granting access to an HPC cluster 
to help with development and testing.
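
To put the benchmark figures quoted further down in this thread in context, here is a minimal sketch of the arithmetic, not part of the PR. The throughput values are copied from the same-machine and EC2 results in this thread; the function names are made up for illustration, and reading "5Gbps" as 5 x 10^9 bits per second with no protocol overhead is an assumption.

```python
MIB = 1024 * 1024  # bytes per MiB

# Same-machine (IPC) throughput in MiB/s, copied from the results in this thread.
IPC_RESULTS = {
    "128KiB": {"grpc": 4463, "ucx_shm": 6500, "ucx_tcp": 1069},
    "2MiB": {"grpc": 3537, "ucx_shm": 13879, "ucx_tcp": 1735},
    "32MiB": {"grpc": 1828, "ucx_shm": 9045, "ucx_tcp": 1602},
}


def ratio_to_grpc(row: dict, transport: str) -> float:
    """Throughput of `transport` relative to gRPC for one batch size."""
    return row[transport] / row["grpc"]


def line_rate_mib_per_s(gbps: float) -> float:
    """Convert a nominal link speed in Gbit/s to MiB/s.

    Assumption: 1 Gbit = 10^9 bits, and protocol overhead is ignored.
    """
    return gbps * 1e9 / 8 / MIB


if __name__ == "__main__":
    for batch, row in IPC_RESULTS.items():
        print(f"{batch}: UCX shm is {ratio_to_grpc(row, 'ucx_shm'):.2f}x gRPC, "
              f"UCX TCP is {ratio_to_grpc(row, 'ucx_tcp'):.2f}x gRPC")
    print(f"5 Gbit/s is about {line_rate_mib_per_s(5):.0f} MiB/s")
```

Under those assumptions, UCX over shared memory comes out at roughly 1.5x, 3.9x and 4.9x gRPC for the three batch sizes. A 5 Gbit/s link tops out near 596 MiB/s, so the 546 to 575 MiB/s measured between the two EC2 machines is already close to the nominal link limit for both transports.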

On Tue, Jan 18, 2022, at 18:33, David Li wrote:
> Ah, yes, thanks for the reminder. That's one of the things that needs 
> to be addressed for sure.
>
> -David
>
> On Tue, Jan 18, 2022, at 17:48, Supun Kamburugamuve wrote:
>> One general observation. I think this implementation uses polling to
>> check progress. Because of the client-server semantics of Arrow Flight,
>> you may need an interrupt-based mechanism like epoll to avoid busy
>> looping.
>> 
>> Best,
>> Supun..
>> 
>> On Tue, Jan 18, 2022 at 8:13 AM David Li  wrote:
>> 
>> > Thanks for those results, Yibo! Looks like there's still more room for
>> > improvement here. Yes, things are a little unstable, though I didn't
>> > have that much trouble just starting the benchmark - I will need
>> > to find suitable hardware and iron out these issues. Note that I've
>> > only implemented DoGet, and I haven't implemented concurrent streams,
>> > which would explain why most benchmark configurations hang or error.
>> >
>> > Since the last time, I've rewritten the prototype to use UCX's "active
>> > message" functionality instead of trying to implement messages over
>> > the "streams" API. This simplified the code. I also did some
>> > refactoring along the lines of Yibo's prototype to share more code
>> > between the gRPC and UCX implementations. Here are some benchmark
>> > numbers:
>> >
>> > For IPC (server/client on the same machine): UCX with shared memory
>> > handily beats gRPC here. UCX with TCP isn't quite up to par, though.
>> >
>> > gRPC:
>> > 128KiB batches: 4463 MiB/s
>> > 2MiB batches:   3537 MiB/s
>> > 32MiB batches:  1828 MiB/s
>> >
>> > UCX (shared memory):
>> > 128KiB batches: 6500 MiB/s
>> > 2MiB batches:  13879 MiB/s
>> > 32MiB batches:  9045 MiB/s
>> >
>> > UCX (TCP):
>> > 128KiB batches: 1069 MiB/s
>> > 2MiB batches:   1735 MiB/s
>> > 32MiB batches:  1602 MiB/s
>> >
>> > For RPC (server/client on different machines): Two t3.xlarge (4 vCPU,
>> > 16 GiB memory) machines were used in AWS EC2. These have "up to" 5Gbps
>> > bandwidth. This isn't really a scenario where UCX is expected to
>> > shine; however, UCX performs comparably to gRPC here.
>> >
>> > gRPC:
>> > 128 KiB batches: 554 MiB/s
>> > 2 MiB batches:   575 MiB/s
>> >
>> > UCX:
>> > 128 KiB batches: 546 MiB/s
>> > 2 MiB batches:   567 MiB/s
>> >
>> > Raw test logs can be found here:
>> > https://gist.github.com/lidavidm/57d8a3cba46229e4d277ae0730939acc
>> >
>> > For IPC, the shared memory results are promising in that it could be
>> > feasible to expose a library purely over Flight without worrying about
>> > FFI bindings. Also, it seems results are roughly comparable to what
>> > Yibo observed in ARROW-15282 [1] meaning UCX will get us both a
>> > performant shared memory transport and support for more exotic
>> > hardware.
>> >
>> > There's still much work to be done; at this point, I'd like to start
>> > implementing the rest of the Flight methods, fixing up the many TODOs
>> > scattered around, refactoring more things to share code between
>> > gRPC/UCX, and finding and benchmarking some hardware that UCX has a
>> > fast path for.
>> >
>> > [1]: https://issues.apache.org/jira/browse/ARROW-15282
>> >
>> > -David
>> >
>> > On Tue, Jan 18, 2022, at 04:35, Yibo Cai wrote:
>> > > Some updates.
>> > >
>> > > I tested David's UCX transport patch over a 100Gb network. FlightRPC over
>> > > UCX/RDMA improves throughput by about 50%, with lower and flat latency.
>> > > And I think there is room to improve further. See test report [1].
>> > >
>> > > For the data plane approach, the PoC shared memory data plane also
>> > > yields a significant performance boost. Details at [2].
>> > >
>> > > Glad to see there is big potential to improve FlightRPC performance.
>> > >
>> > > [1] https://issues.apache.org/jira/browse/ARROW-15229
>> > > [2] https://issues.apache.org/jira/browse/ARROW-15282
>> > >
>> > > On 12/30/21 11:57 PM, David Li wrote:
>> > > > Ah, I see.
>> > > >
>> > > > I think both projects can proceed as well. At some point we will have
>> > > > to figure out how to merge them, but I think it's too early to see how
>> > > > exactly we will want to refactor things.
>> > > >
>> 
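
Supun's point earlier in this thread about busy polling versus interrupt-style waiting can be illustrated in general terms. The sketch below is not the UCX transport and makes no claim about UCX's own progress API; it only shows the underlying idea with Python's standard `selectors` module, which picks epoll on Linux. The function name `wait_for_message` and the use of a local socket pair are assumptions made for illustration.

```python
import selectors
import socket
from typing import Optional


def wait_for_message(sock: socket.socket, timeout: float = 1.0) -> Optional[bytes]:
    """Sleep until `sock` is readable, then read once.

    Returns None if nothing arrives within `timeout` seconds.
    """
    # DefaultSelector chooses the best mechanism available on the platform
    # (epoll on Linux, kqueue on BSD/macOS, select elsewhere).
    with selectors.DefaultSelector() as sel:
        sel.register(sock, selectors.EVENT_READ)
        # select() blocks in the kernel until the socket is ready or the
        # timeout expires, so the thread is not spinning while it waits.
        events = sel.select(timeout=timeout)
        if not events:
            return None
        return sock.recv(4096)


if __name__ == "__main__":
    server_side, client_side = socket.socketpair()
    client_side.sendall(b"ping")
    print(wait_for_message(server_side))
    server_side.close()
    client_side.close()
```

The trade-off is the usual one: a busy loop keeps latency minimal at the cost of a fully occupied core, while a blocking wait frees the core but adds a wakeup delay. A server handling many mostly idle clients generally prefers the latter.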

Re: [VOTE][RUST] Release Apache Arrow Rust 11.0.0 RC1

2022-03-18 Thread Andy Grove
+1 (binding)

Verified on Ubuntu 20.04.3 LTS

On Fri, Mar 18, 2022 at 2:01 AM Andrew Lamb  wrote:

> Hi,
>
> I would like to propose a release of Apache Arrow Rust Implementation,
> version 11.0.0.
>
> This release candidate is based on commit:
> 5d6b638111e3f9c72dc8504ea98e46914fc93af5 [1]
>
> The proposed release tarball and signatures are hosted at [2].
>
> The changelog is located at [3].
>
> Please download, verify checksums and signatures, run the unit tests,
> and vote on the release. There is a script [4] that automates some of
> the verification.
>
> The vote will be open for at least 72 hours.
>
> [ ] +1 Release this as Apache Arrow Rust
> [ ] +0
> [ ] -1 Do not release this as Apache Arrow Rust  because...
>
> [1]:
>
> https://github.com/apache/arrow-rs/tree/5d6b638111e3f9c72dc8504ea98e46914fc93af5
> [2]:
> https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-rs-11.0.0-rc1
> [3]:
>
> https://github.com/apache/arrow-rs/blob/5d6b638111e3f9c72dc8504ea98e46914fc93af5/CHANGELOG.md
> [4]:
>
> https://github.com/apache/arrow-rs/blob/master/dev/release/verify-release-candidate.sh
> -
>


[VOTE][RUST] Release Apache Arrow Rust 11.0.0 RC1

2022-03-18 Thread Andrew Lamb
Hi,

I would like to propose a release of Apache Arrow Rust Implementation,
version 11.0.0.

This release candidate is based on commit:
5d6b638111e3f9c72dc8504ea98e46914fc93af5 [1]

The proposed release tarball and signatures are hosted at [2].

The changelog is located at [3].

Please download, verify checksums and signatures, run the unit tests,
and vote on the release. There is a script [4] that automates some of
the verification.
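
For readers who want to see what the checksum part of that verification amounts to, here is a minimal sketch. It is not the verification script at [4]. It assumes a SHA-256 checksum file sits next to the tarball and starts with the hex digest (the `sha256sum` layout). The function names are hypothetical, and signature (GPG) verification is not covered.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(artifact: Path, checksum_file: Path) -> bool:
    """Compare an artifact's digest with the one recorded in checksum_file.

    Assumption: the checksum file begins with the hex digest, optionally
    followed by whitespace and a file name.
    """
    expected = checksum_file.read_text().split()[0].lower()
    return sha256_of(artifact) == expected
```

A caller would pass the downloaded tarball and its checksum file, for example `checksum_matches(Path("release.tar.gz"), Path("release.tar.gz.sha256"))`, where both file names are placeholders rather than the actual artifact names.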

The vote will be open for at least 72 hours.

[ ] +1 Release this as Apache Arrow Rust
[ ] +0
[ ] -1 Do not release this as Apache Arrow Rust  because...

[1]:
https://github.com/apache/arrow-rs/tree/5d6b638111e3f9c72dc8504ea98e46914fc93af5
[2]: https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-rs-11.0.0-rc1
[3]:
https://github.com/apache/arrow-rs/blob/5d6b638111e3f9c72dc8504ea98e46914fc93af5/CHANGELOG.md
[4]:
https://github.com/apache/arrow-rs/blob/master/dev/release/verify-release-candidate.sh
-