Nishanth created ARROW-17452:
--------------------------------

             Summary: Arrow-Parquet c++ errors out OSError: Malformed levels min: 3 max: 3 out of range. Max Level: 2
                 Key: ARROW-17452
                 URL: https://issues.apache.org/jira/browse/ARROW-17452
             Project: Apache Arrow
          Issue Type: Bug
          Components: C++
    Affects Versions: 9.0.0
            Reporter: Nishanth
         Attachments: athena_struct.gz.parquet

The current Arrow Parquet C++ reader fails on some files with the following error:
{code:java}
OSError: Malformed levels min: 3 max: 3 out of range. Max Level: 2{code}
This shows up particularly on Parquet columns with nested data structures. The exception comes from a check that the minimum and maximum of the decoded levels fall within the range implied by the column's maximum level:
[https://github.com/apache/arrow/blob/master/cpp/src/parquet/column_reader.cc#L177|http://example.com/]

The Parquet files were created in Athena using the following statements and read with the Arrow Parquet C++ reader.
{code:java}
create table struct_athena (int1 int, struct1 struct<field1: string, field2: string>)
LOCATION 's3://'
TBLPROPERTIES (
  'table_type'='ICEBERG',
  'format'='parquet'
);

insert into struct_athena VALUES (1, (CAST(ROW('one', 'two') AS ROW(field1 varchar, field2 varchar))));
{code}
The generated Parquet file is attached to this JIRA issue.
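
For readers unfamiliar with the check, here is a rough sketch in plain Python of what a level range validation of this kind does. It is not the Arrow implementation; the function names {{max_definition_level}} and {{check_levels}} are hypothetical, and it assumes that both {{struct1}} and its fields are optional, which the report does not state. Under that assumption the maximum definition level of {{struct1.field1}} would be 2, so a decoded level of 3 is rejected.

```python
from typing import Sequence


def max_definition_level(path_repetitions: Sequence[str]) -> int:
    """Count the non-required nodes on the path from the root to a leaf.

    In Parquet's nested encoding, each optional or repeated node on the
    path contributes one definition level.
    """
    return sum(1 for rep in path_repetitions if rep in ("optional", "repeated"))


def check_levels(levels: Sequence[int], max_level: int) -> None:
    """Raise OSError if any decoded level lies outside [0, max_level]."""
    if not levels:
        return  # nothing was decoded, so there is nothing to validate
    lo, hi = min(levels), max(levels)
    if lo < 0 or hi > max_level:
        raise OSError(
            f"Malformed levels min: {lo} max: {hi} out of range. "
            f"Max Level: {max_level}"
        )


# Assumed schema: optional struct1 -> optional field1 (not confirmed by the report).
max_level = max_definition_level(["optional", "optional"])

check_levels([0, 1, 2], max_level)  # levels within range pass silently

try:
    check_levels([3], max_level)  # a level above the maximum is rejected
except OSError as exc:
    print(exc)
```

If the file really does carry a definition level of 3 for a column whose schema only allows 2, one possible reading is that the writer and the reader disagree about how many optional nodes sit on the column path. That is only a hypothesis; the attached file would need to be inspected to confirm it.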