+1

> What should be the way forward? Should I submit a format update
> and then one or two implementations thereof?

Based on my observation of recent format changes, the process usually
follows the steps below:

(1) A PR for the format change.
(2) Two PRs with PoC implementations, to demonstrate the feature and its
    interoperability.
(3) A formal vote on the mailing list.

Best,
Gang

On Tue, Jan 9, 2024 at 7:16 AM Martin Loncaric <[email protected]> wrote:

> +1 that this is beneficial, especially for 16 bit floats
>
> On Mon, Jan 8, 2024, 11:56 Antoine Pitrou <[email protected]> wrote:
>
> > Hello all,
> >
> > Based on the response received, it seems this addition is
> > non-controversial and generally considered beneficial.
> >
> > What should be the way forward? Should I submit a format update
> > and then one or two implementations thereof?
> >
> > Regards
> >
> > Antoine.
> >
> >
> > On Sun, 7 Jan 2024 23:40:11 -0800
> > Micah Kornfield <[email protected]> wrote:
> > > I responded there but generally, this doesn't seem like it imposes
> > > a lot of implementation burden and can be useful.
> > >
> > > On Thu, Dec 14, 2023 at 12:59 PM Antoine Pitrou <[email protected]> wrote:
> > >
> > > > Hello,
> > > >
> > > > Just a heads up here so as to reach a wider audience: I've posted a
> > > > format addition proposal in
> > > > https://issues.apache.org/jira/browse/PARQUET-2414
> > > >
> > > > Excerpt:
> > > > """
> > > > This issue proposed to widen the types supported by the
> > > > BYTE_STREAM_SPLIT. By allowing the BYTE_STREAM_SPLIT on any
> > > > FIXED_LEN_BYTE_ARRAY column, we can automatically improve compression
> > > > efficiency on various column types including:
> > > >
> > > > half-float data
> > > > fixed-width decimal data
> > > >
> > > > [etc.]
> > > > """
> > > >
> > > > Feel free to comment here or on the JIRA issue.
> > > >
> > > > Regards
> > > >
> > > > Antoine.
