[ https://issues.apache.org/jira/browse/ARROW-13120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Morgan Cassels updated ARROW-13120:
-----------------------------------
    Component/s: Rust

> [Rust][Parquet] Cannot read multiple batches from parquet with string list column
> ---------------------------------------------------------------------------------
>
>                 Key: ARROW-13120
>                 URL: https://issues.apache.org/jira/browse/ARROW-13120
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Rust
>            Reporter: Morgan Cassels
>            Priority: Major
>         Attachments: test.parquet
>
>
> This issue occurs only when the batch size is smaller than the number of rows
> in the table. The attached Parquet file `test.parquet` has 31430 rows and a
> single column containing string lists. The issue does not appear to occur for
> Parquet files with integer list columns.
>
> {code:java}
> #[test]
> fn failing_test() {
>     let parquet_file_reader = get_test_reader("test.parquet");
>     let mut arrow_reader = ParquetFileArrowReader::new(parquet_file_reader);
>     let mut record_batches = Vec::new();
>     // Batch size 1024 is smaller than the 31430 rows in the file,
>     // so the reader has to produce more than one batch.
>     let record_batch_reader = arrow_reader.get_record_reader(1024).unwrap();
>     // Iterating over the reader panics (see the output below).
>     for batch in record_batch_reader {
>         record_batches.push(batch);
>     }
> }
> {code}
>
> {code:java}
> ---- arrow::arrow_reader::tests::failing_test stdout ----
> thread 'arrow::arrow_reader::tests::failing_test' panicked at 'Expected
> infallable creation of GenericListArray from ArrayDataRef failed:
> InvalidArgumentError("offsets do not start at zero")',
> arrow/src/array/array_list.rs:195:45
> note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
> {code}
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
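
For readers unfamiliar with the panic message quoted above: a list column is laid out as a flat values buffer plus an offsets buffer, and the check named in the message expects the first offset to be zero. The sketch below only illustrates what that condition means. It uses plain `Vec<i32>` offsets rather than the arrow crate's types, the helpers `slice_offsets` and `rebase_offsets` are hypothetical names, and it is not a diagnosis of where the reader goes wrong.

```rust
// Illustrative only: plain Vec-based offsets, not the arrow crate's types.
// `slice_offsets` and `rebase_offsets` are hypothetical helper names.

/// Returns the offsets covering lists `start..start + len`, unadjusted.
fn slice_offsets(offsets: &[i32], start: usize, len: usize) -> Vec<i32> {
    offsets[start..=start + len].to_vec()
}

/// Shifts a run of offsets so that the first one is zero.
fn rebase_offsets(offsets: &[i32]) -> Vec<i32> {
    let base = offsets.first().copied().unwrap_or(0);
    offsets.iter().map(|o| o - base).collect()
}

fn main() {
    // Four lists with lengths 2, 1, 3, 2 over a flat buffer of 8 values.
    let offsets = vec![0, 2, 3, 6, 8];

    // First batch of two lists: the offsets already start at zero.
    let first = slice_offsets(&offsets, 0, 2);
    // Second batch of two lists: the raw offsets start at 3, not 0.
    let second = slice_offsets(&offsets, 2, 2);

    println!("first batch offsets:  {:?}", first);
    println!("second batch offsets: {:?}", second);
    println!("second batch rebased: {:?}", rebase_offsets(&second));
}
```

In this toy layout only the first batch passes a "starts at zero" check without adjustment, which is consistent with the report that the failure needs a batch size smaller than the row count.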