pitrou opened a new pull request, #47619:
URL: https://github.com/apache/arrow/pull/47619
### Rationale for this change
Parquet CLI tools fail to print the statistics for a Decimal column whose
precision exceeds the maximum Decimal128 precision.
Example:
```console
$ /build/build-test/debug/parquet-reader --only-metadata /tmp/pqfuzz/pq-table-1
...
Column 5: col_6 (FIXED_LEN_BYTE_ARRAY(11) / Decimal(precision=24, scale=7) / DECIMAL(24,7))
Column 6: col_7 (FIXED_LEN_BYTE_ARRAY(18) / Decimal(precision=43, scale=7) / DECIMAL(43,7))
...
Column 5
Values: 375, Null Values: 74, Distinct Values: 0
Max (exact: true): 98505381700645007.0205463, Min (exact: true): -99708959786297168.1726196
Compression: UNCOMPRESSED, Encodings: PLAIN(DICT_PAGE) RLE_DICTIONARY
Uncompressed Size: 3754, Compressed Size: 3754
Column 6
Values: 375, Null Values: 69, Distinct Values: 0
Max (exact: true): Parquet error: Failed to parse decimal value: Length of byte array passed to Decimal128::FromBigEndian was 18, but must be between 1 and 16
...
```
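To illustrate why column 6 trips the 16-byte limit, here is a rough Python sketch of the byte-width arithmetic. It assumes the fixed-length array holds a big-endian two's-complement unscaled integer, as the Parquet DECIMAL encoding does; `min_bytes_for_precision` is a hypothetical helper written for this note, not a function from Arrow or Parquet.

```python
def min_bytes_for_precision(precision: int) -> int:
    """Smallest two's-complement byte width that can hold any
    unscaled decimal value with the given number of digits."""
    max_unscaled = 10 ** precision - 1
    # Magnitude bits plus one sign bit, rounded up to whole bytes.
    bits = max_unscaled.bit_length() + 1
    return (bits + 7) // 8


if __name__ == "__main__":
    for precision in (24, 38, 39, 43, 76):
        print(precision, min_bytes_for_precision(precision))
```

Under these assumptions, precision 24 needs 11 bytes and precision 43 needs 18 bytes, matching the `FIXED_LEN_BYTE_ARRAY` widths shown above. Precision 38 is the largest that still fits in 16 bytes, and precision 76 fits in 32 bytes.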
### What changes are included in this PR?
Use Decimal256 instead of Decimal128 when printing a Decimal statistic.
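As a language-neutral illustration of what printing such a statistic involves, the sketch below decodes a big-endian two's-complement byte array of any length and formats it with a given scale. It is not the code in this PR (which is C++ and uses Decimal256); `format_decimal` is a hypothetical helper, and Python's arbitrary-precision integers stand in for the wider decimal type.

```python
def format_decimal(raw: bytes, scale: int) -> str:
    """Format a big-endian two's-complement unscaled value as a decimal string."""
    unscaled = int.from_bytes(raw, byteorder="big", signed=True)
    sign = "-" if unscaled < 0 else ""
    # Left-pad so there is at least one digit before the decimal point.
    digits = str(abs(unscaled)).rjust(scale + 1, "0")
    if scale == 0:
        return sign + digits
    return f"{sign}{digits[:-scale]}.{digits[-scale:]}"


if __name__ == "__main__":
    # An 11-byte value like column 5's maximum, and an 18-byte value
    # as wide as column 6's storage.
    narrow = (985053817006450070205463).to_bytes(11, "big", signed=True)
    wide = (-(10 ** 42)).to_bytes(18, "big", signed=True)
    print(format_decimal(narrow, 7))
    print(format_decimal(wide, 7))
```

The point of the sketch is that the decoding step must accept inputs wider than 16 bytes; a 128-bit type cannot represent the 18-byte input at all, whereas a 256-bit type accepts up to 32 bytes.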
### Are these changes tested?
Yes, by new tests.
### Are there any user-facing changes?
No.