[ https://issues.apache.org/jira/browse/ARROW-17465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Antoine Pitrou updated ARROW-17465:
-----------------------------------
    Component/s: C++
                 Parquet

> [Parquet] DELTA_BINARY_PACKED constraint on num_bits is too restrict?
> ---------------------------------------------------------------------
>
>                 Key: ARROW-17465
>                 URL: https://issues.apache.org/jira/browse/ARROW-17465
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++, Parquet
>            Reporter: Jorge Leitão
>            Priority: Major
>
> Consider the sequence of (int32) values
> [863490391,-816295192,1613070492,-1166045478,1856530847]
> This sequence can be encoded as a single block with a single miniblock
> and a bit_width of 33.
> However, we currently require [1] the bit_width of each miniblock to be
> smaller than the bit width of the type it encodes.
> We could consider lifting this constraint because, as the example above
> shows, the `bit_width` of the values' representation can be smaller than
> the `bit_width` of the deltas' representation.
> [1]
> https://github.com/apache/arrow/blob/a376968089d7310f4a88d054822fa1eaf96c46f5/cpp/src/parquet/encoding.cc#L2173



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
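
The 33-bit figure for the example sequence in the issue description can be checked with a few lines of Python. This is only an illustrative sketch, not Arrow's encoder: it assumes the deltas are computed with unbounded integers (no 32-bit wraparound) and that each delta is stored relative to the minimum delta of the miniblock. The variable names are made up for the example.

```python
# Illustrative check only; this is not the Arrow/Parquet implementation.
values = [863490391, -816295192, 1613070492, -1166045478, 1856530847]

# Deltas between consecutive values, using Python's unbounded integers
# (assumption: no 32-bit wraparound arithmetic).
deltas = [b - a for a, b in zip(values, values[1:])]

# Assumption: each delta is stored as an offset from the minimum delta.
min_delta = min(deltas)
adjusted = [d - min_delta for d in deltas]

# Number of bits needed for the largest adjusted delta.
bit_width = max(adjusted).bit_length()

print(deltas)     # [-1679785583, 2429365684, -2779115970, 3022576325]
print(bit_width)  # 33
```

The largest adjusted delta is 5801692295, which lies between 2^32 and 2^33, so it needs 33 bits even though every input value fits in an int32.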