+1 (non-binding)

On Thu, Mar 7, 2024 at 1:25 PM wish maple <maplewish...@gmail.com> wrote:
> +1 (non-binding)
>
> Best,
> Xuwei Fu
>
> On Thu, Mar 7, 2024 at 21:18, Antoine Pitrou <anto...@python.org> wrote:
>
> > Hello,
> >
> > As discussed previously on this ML [1], I am proposing to expand
> > the types supported by the BYTE_STREAM_SPLIT encoding. The currently
> > supported types are FLOAT and DOUBLE. The proposal expands the
> > supported types to INT32, INT64 and FIXED_LEN_BYTE_ARRAY.
> >
> > The format addition is tracked on JIRA where some measurements on
> > sample data are also published and discussed [2].
> >
> > (please note that the original ML thread only discussed expanding
> > to FIXED_LEN_BYTE_ARRAY; discussion on the JIRA issue led to the
> > conclusion that it would also be beneficial to cover INT32 and INT64)
> >
> > The format additions are submitted as a PR in [3].
> > A data file for integration testing is submitted in [4].
> > An implementation for Parquet C++ is submitted in [5].
> > An implementation for parquet-mr is submitted in [6].
> >
> > This vote will be open for at least 1 week.
> >
> > +1: Accept the format additions
> > +0: ...
> > -1: Reject the format additions because ...
> >
> > Regards
> >
> > Antoine.
> >
> >
> > [1] https://lists.apache.org/thread/5on7rnc141jnw2cdxtsfgm5xhhdmsb4q
> > [2] https://issues.apache.org/jira/browse/PARQUET-2414
> > [3] https://github.com/apache/parquet-format/pull/229
> > [4] https://github.com/apache/parquet-testing/pull/46
> > [5] https://github.com/apache/arrow/pull/40094
> > [6] https://github.com/apache/parquet-mr/pull/1291