Xianjin YE created PARQUET-2045:
-----------------------------------

Summary: ConsecutiveChunkList's length field should be long instead of int
Key: PARQUET-2045
URL: https://issues.apache.org/jira/browse/PARQUET-2045
Project: Parquet
Issue Type: Bug
Reporter: Xianjin YE
Assignee: Xianjin YE
Attachments: image-2021-05-10-17-12-00-083.png, image-2021-05-10-17-14-45-401.png
Hi, we encountered read failures for a large column chunk (size > Int.MaxValue). After some debugging, we found the cause: ConsecutiveChunkList's length field is an int, and it overflows when the uncompressed size of a single ColumnChunk is larger than Int.MaxValue.

Below is the exception stack:

!image-2021-05-10-17-12-00-083.png!

The column size is roughly as follows:

!image-2021-05-10-17-14-45-401.png!

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
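
For illustration, the overflow described above can be reproduced with a minimal Java sketch. This is not the actual ConsecutiveChunkList code: the class ChunkLengthOverflow and the methods totalLengthAsInt and totalLengthAsLong are hypothetical names, and the chunk sizes are made-up values chosen only to cross Integer.MAX_VALUE (Java's equivalent of Int.MaxValue).

```java
// Hypothetical stand-in for a chunk list that accumulates a total length.
// It only demonstrates the int-vs-long behaviour; it is not Parquet code.
final class ChunkLengthOverflow {

    // Buggy variant: the running total is an int. The compound assignment
    // "length += size" implicitly narrows the long sum back to int, so the
    // value wraps around silently once it exceeds Integer.MAX_VALUE.
    public static int totalLengthAsInt(long[] chunkSizes) {
        int length = 0;
        for (long size : chunkSizes) {
            length += size;
        }
        return length;
    }

    // Fixed variant: the running total is a long. Math.addExact throws an
    // ArithmeticException instead of wrapping if even a long would overflow.
    public static long totalLengthAsLong(long[] chunkSizes) {
        long length = 0L;
        for (long size : chunkSizes) {
            length = Math.addExact(length, size);
        }
        return length;
    }

    public static void main(String[] args) {
        // Assumed sizes: together they are one byte past Integer.MAX_VALUE.
        long[] sizes = { Integer.MAX_VALUE, 1L };
        System.out.println(totalLengthAsInt(sizes));  // prints -2147483648
        System.out.println(totalLengthAsLong(sizes)); // prints 2147483648
    }
}
```

A negative or wrapped length of this kind would plausibly surface later as a read failure (for example when sizing a buffer), which is consistent with the report, although the exact failure point is only visible in the attached stack trace.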