Hi,

I would like to bring to this mailing list a proposal to donate the source code of arrow2 [2] and parquet2 [3] as experimental repositories [1] within Apache Arrow, conditional on IP clearance.
The specific PRs are:

* https://github.com/apache/arrow-experimental-rs-arrow2/pull/1
* https://github.com/apache/arrow-experimental-rs-parquet2/pull/1

The source code contains rewrites of the arrow and parquet crates with safety and security in mind. In particular:

* no buffer transmutes
* no unsafe APIs marked as safe
* parquet's implementation is free of unsafe code

There are many other important features, such as big endian support and IPC 2.0 support. There is one regression over latest: no support for nested types in parquet read and write. I observed no negative impact on performance.

See [4] for a longer discussion of why the current Rust implementation is susceptible to safety violations. In particular, many core APIs of the crate are considered security vulnerabilities under RustSec's [5] definitions, and are difficult to address in its current design.

I validated that it is possible to migrate DataFusion [6] and Polars [7] without further code changes.

The vote will be open for at least 72 hours.

[ ] +1 Accept the code donation as experimental repos.
[ ] +0
[ ] -1 Do not accept the code donation as experimental repos because...

[1] https://github.com/apache/arrow/blob/master/docs/source/developers/experimental_repos.rst
[2] https://github.com/jorgecarleitao/arrow2
[3] https://github.com/jorgecarleitao/parquet2
[4] https://github.com/jorgecarleitao/arrow2#faq
[5] https://rustsec.org/
[6] https://github.com/apache/arrow-datafusion/pull/68
[7] https://github.com/pola-rs/polars