[
https://issues.apache.org/jira/browse/PARQUET-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gang Wu resolved PARQUET-2420.
------------------------------
Fix Version/s: 1.14.0
Resolution: Fixed
> ThriftParquetWriter converts thrift byte to int32 without adding logical type
> ------------------------------------------------------------------------------
>
> Key: PARQUET-2420
> URL: https://issues.apache.org/jira/browse/PARQUET-2420
> Project: Parquet
> Issue Type: Bug
> Components: parquet-thrift
> Reporter: Shreyas B
> Assignee: Shreyas B
> Priority: Major
> Fix For: 1.14.0
>
>
> The current implementation of Parquet serialisation from Thrift definitions
> converts Thrift byte fields to INT32 without attaching the required
> LogicalType annotation to the column in the Parquet schema. The fact that
> the field is 8 bits wide is therefore lost, which is inconsistent with the
> handling of i16 fields, which are annotated as INTEGER(16,true). The correct
> conversion is INT32 with a LogicalType annotation of bit width 8, signed.
>
> Thrift Definition
> {code:java}
> struct TestLogicalType {
>   1: required i16 test_i16,
>   2: required byte test_i8
> }
> {code}
> Current Parquet Schema Generated
> {code:java}
> message ParquetSchema {
>   required int32 test_i16 (INTEGER(16,true)) = 1;
>   required int32 test_i8 = 2;
> }
> {code}
> Expected Parquet Schema
> {code:java}
> message ParquetSchema {
>   required int32 test_i16 (INTEGER(16,true)) = 1;
>   required int32 test_i8 (INTEGER(8,true)) = 2;
> }
> {code}
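
The mapping the report asks for can be sketched in plain Java. This is an illustrative model only, not the parquet-thrift converter: the class `ThriftIntMapping`, the enum `ThriftInt`, and the method `parquetField` are hypothetical names, and only the two Thrift types that appear in the report (byte and i16) are covered. It assumes every field is `required`, as in the example struct.

```java
// Hypothetical sketch: models the expected Thrift-to-Parquet field mapping
// described in PARQUET-2420. It does not use or reproduce parquet-thrift code.
class ThriftIntMapping {

    // Only the Thrift integer types mentioned in the report, with their bit widths.
    public enum ThriftInt {
        BYTE(8),
        I16(16);

        public final int bitWidth;

        ThriftInt(int bitWidth) {
            this.bitWidth = bitWidth;
        }
    }

    // Renders one required field in Parquet schema text form.
    // Both types are stored as physical int32; the signed INTEGER annotation
    // records the original bit width so that it is not lost.
    public static String parquetField(String name, ThriftInt type, int fieldId) {
        return String.format(
                "required int32 %s (INTEGER(%d,true)) = %d;",
                name, type.bitWidth, fieldId);
    }

    public static void main(String[] args) {
        // Fields of the TestLogicalType struct from the report.
        System.out.println(parquetField("test_i16", ThriftInt.I16, 1));
        System.out.println(parquetField("test_i8", ThriftInt.BYTE, 2));
    }
}
```

Running `main` prints the two field lines of the expected schema above. The point of the sketch is that byte and i16 go through the same path, so neither loses its width annotation.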