[ https://issues.apache.org/jira/browse/ARROW-11077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Neville Dipale reassigned ARROW-11077:
--------------------------------------

    Assignee: Neville Dipale

> [Rust] ParquetFileArrowReader panicks when trying to read nested list
> ---------------------------------------------------------------------
>
>                 Key: ARROW-11077
>                 URL: https://issues.apache.org/jira/browse/ARROW-11077
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Rust
>            Reporter: Ben Sully
>            Assignee: Neville Dipale
>            Priority: Major
>         Attachments: small-nested-lists.parquet
>
>
> I think this is documented in the code, but I can't be 100% sure.
> When trying to execute a DataFusion query over a Parquet file where one field
> is a struct with a nested list, the thread panics due to unwrapping an
> `Option::None` [at this
> point|https://github.com/apache/arrow/blob/36d80e37373ab49454eb47b2a89c10215ca1b67e/rust/parquet/src/arrow/array_reader.rs#L1334-L1337].
> This `None` is returned by
> [`visit_primitive`|https://github.com/apache/arrow/blob/master/rust/parquet/src/arrow/array_reader.rs#L1243-L1245],
> but I can't quite make sense of _why_ it returns a `None` rather than an
> error.
> I added a couple of `dbg!` calls to see what the item_type and list_type are:
> {code}
> [/home/ben/repos/rust/arrow/rust/parquet/src/arrow/array_reader.rs:1339] &item_type = PrimitiveType {
>     basic_info: BasicTypeInfo {
>         name: "item",
>         repetition: Some(
>             OPTIONAL,
>         ),
>         logical_type: UTF8,
>         id: None,
>     },
>     physical_type: BYTE_ARRAY,
>     type_length: -1,
>     scale: -1,
>     precision: -1,
> }
> [/home/ben/repos/rust/arrow/rust/parquet/src/arrow/array_reader.rs:1340] &list_type = GroupType {
>     basic_info: BasicTypeInfo {
>         name: "tags",
>         repetition: Some(
>             OPTIONAL,
>         ),
>         logical_type: LIST,
>         id: None,
>     },
>     fields: [
>         GroupType {
>             basic_info: BasicTypeInfo {
>                 name: "list",
>                 repetition: Some(
>                     REPEATED,
>                 ),
>                 logical_type: NONE,
>                 id: None,
>             },
>             fields: [
>                 PrimitiveType {
>                     basic_info: BasicTypeInfo {
>                         name: "item",
>                         repetition: Some(
>                             OPTIONAL,
>                         ),
>                         logical_type: UTF8,
>                         id: None,
>                     },
>                     physical_type: BYTE_ARRAY,
>                     type_length: -1,
>                     scale: -1,
>                     precision: -1,
>                 },
>             ],
>         },
>     ],
> }
> {code}
> I guess we should at least use `.expect` here instead of `.unwrap` so it's
> clearer why this is happening!
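
The suggestion at the end of the report (replace a bare `.unwrap` with something that explains the failure) can be sketched as follows. This is only an illustration of the pattern, not the actual code in `array_reader.rs`: the `Reader` type, the `supported` flag, and the `item_reader` helper are hypothetical, and the `visit_primitive` here is a simplified stand-in that shares only the name and the `Ok(None)` behaviour described above.

```rust
// Hypothetical stand-in for an array reader; not the real parquet type.
#[derive(Debug, PartialEq)]
struct Reader(String);

// Simplified stand-in for the visitor in the report: it can succeed and
// still hand back `None`, which a caller might be tempted to `.unwrap()`.
fn visit_primitive(name: &str, supported: bool) -> Result<Option<Reader>, String> {
    if supported {
        Ok(Some(Reader(name.to_string())))
    } else {
        Ok(None)
    }
}

// Hypothetical caller: instead of `.unwrap()` (panic with no context) or
// `.expect("...")` (panic with a message), convert `None` into an error
// that the caller can propagate with `?`.
fn item_reader(name: &str, supported: bool) -> Result<Reader, String> {
    visit_primitive(name, supported)?
        .ok_or_else(|| format!("no array reader could be built for list item `{}`", name))
}

fn main() {
    // The supported case yields a reader as before.
    assert_eq!(item_reader("id", true), Ok(Reader("id".to_string())));

    // The unsupported case now surfaces as a descriptive error, not a panic.
    match item_reader("item", false) {
        Ok(reader) => println!("built {:?}", reader),
        Err(message) => println!("error: {}", message),
    }
}
```

Using `.expect` with a message would already make the panic easier to diagnose; returning an error, as in this sketch, additionally lets a DataFusion query fail cleanly instead of aborting the thread.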