Hi,

I would like to bring to this mailing list a proposal to donate the source
code of arrow2 [2] and parquet2 [3] as experimental repositories [1] within
Apache Arrow, conditional on IP clearance.

The specific PRs are:

* https://github.com/apache/arrow-experimental-rs-arrow2/pull/1
* https://github.com/apache/arrow-experimental-rs-parquet2/pull/1

The source code contains rewrites of the arrow and parquet crates with
safety and security in mind. In particular,

* no buffer transmutes
* no unsafe APIs marked as safe
* the parquet implementation contains no unsafe code
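
As a minimal sketch of what the first two points mean in practice (this is
hypothetical illustration code, not taken from arrow2, parquet2, or the
existing crates; the name decode_i32_le is an assumption), a byte buffer can
be decoded value by value with a length check instead of being transmuted
into a typed slice:

```rust
/// Hypothetical helper (not part of any Arrow crate): decode little-endian
/// i32 values from a byte buffer without transmuting `&[u8]` into `&[i32]`.
///
/// Returns `None` when the buffer length is not a multiple of 4, so the
/// function is safe to call with arbitrary input.
pub fn decode_i32_le(bytes: &[u8]) -> Option<Vec<i32>> {
    // Reject buffers that cannot hold a whole number of i32 values.
    if bytes.len() % 4 != 0 {
        return None;
    }
    Some(
        bytes
            // Each chunk is exactly 4 bytes long, so the indexing below
            // cannot go out of bounds.
            .chunks_exact(4)
            // Decoding byte by byte has no alignment requirement and gives
            // the same result on little- and big-endian hosts.
            .map(|c| i32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect(),
    )
}
```

The copy costs an allocation that a transmute would avoid; the sketch only
shows that a safe signature can enforce the length check itself rather than
leaving it to the caller.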

There are many other important features, such as big-endian support and IPC
2.0 support. There is one regression relative to the latest arrow and parquet
crates: nested types are not supported in parquet read and write. I observe
no negative impact on performance.

See [4] for a longer discussion of why the current Rust implementation is
susceptible to safety violations. In particular, many core APIs of the crate
are considered security vulnerabilities under RustSec's [5] definitions, and
they are difficult to address within its current design.

I validated that it is possible to migrate DataFusion [6] and Polars [7]
without further code changes.

The vote will be open for at least 72 hours.

[ ] +1 Accept the code donation as experimental repos.
[ ] +0
[ ] -1 Do not accept the code donation as experimental repos because...

[1]
https://github.com/apache/arrow/blob/master/docs/source/developers/experimental_repos.rst
[2] https://github.com/jorgecarleitao/arrow2
[3] https://github.com/jorgecarleitao/parquet2
[4] https://github.com/jorgecarleitao/arrow2#faq
[5] https://rustsec.org/
[6] https://github.com/apache/arrow-datafusion/pull/68
[7] https://github.com/pola-rs/polars
