+1 that this is beneficial, especially for 16 bit floats

On Mon, Jan 8, 2024, 11:56 Antoine Pitrou <[email protected]> wrote:

> Hello all,
>
> Based on the response received, it seems this addition is
> non-controversial and generally considered beneficial.
>
> What should be the way forward? Should I submit a format update
> and then one or two implementations thereof?
>
> Regards
>
> Antoine.
>
>
> On Sun, 7 Jan 2024 23:40:11 -0800
> Micah Kornfield <[email protected]> wrote:
> > I responded there but generally, this doesn't seem like it imposes a lot of
> > implementation burden and can be useful.
> >
> > On Thu, Dec 14, 2023 at 12:59 PM Antoine Pitrou <[email protected]> wrote:
> >
> > > Hello,
> > >
> > > Just a heads up here so as to reach a wider audience: I've posted a
> > > format addition proposal in
> > > https://issues.apache.org/jira/browse/PARQUET-2414
> > >
> > > Excerpt:
> > > """
> > > This issue proposed to widen the types supported by the
> > > BYTE_STREAM_SPLIT. By allowing the BYTE_STREAM_SPLIT on any
> > > FIXED_LEN_BYTE_ARRAY column, we can automatically improve compression
> > > efficiency on various column types including:
> > >
> > > half-float data
> > > fixed-width decimal data
> > >
> > > [etc.]
> > > """
> > >
> > > Feel free to comment here or on the JIRA issue.
> > >
> > > Regards
> > >
> > > Antoine.
