Cheappie opened a new issue, #2161:
URL: https://github.com/apache/arrow-datafusion/issues/2161

   **Describe the bug**
   I get an index-out-of-bounds panic when Parquet pruning is enabled.
   
   file: metadata.rs:212:10
   struct: RowGroupMetaData
   accessed field: columns
   error: thread 'tokio-runtime-worker' panicked at 'index out of bounds: the len is 1 but the index is 1'
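   
   One possible reading of this panic, which is an assumption and not something confirmed in this report: the pruning code looks up a column's statistics by its position in the merged table schema (`a`, `b`), while each file's row group metadata holds only that file's own columns. The sketch below reproduces that failure mode with plain Rust types. `ColumnStats`, `stats_by_table_index`, `stats_by_name` and the min/max values are hypothetical stand-ins for illustration, not DataFusion or parquet APIs.
   
   ```rust
   /// Simplified stand-in for the per-column statistics of one row group.
   /// Hypothetical type for illustration; not the parquet crate's metadata.
   #[derive(Debug, Clone, PartialEq)]
   struct ColumnStats {
       name: String,
       min: i32,
       max: i32,
   }
   
   /// Positional lookup using an index taken from the merged table schema.
   /// Panics when the file has fewer columns than the table schema.
   fn stats_by_table_index(file_columns: &[ColumnStats], table_index: usize) -> &ColumnStats {
       &file_columns[table_index]
   }
   
   /// Lookup by column name. Returns None when the file lacks the column,
   /// so a caller can skip pruning on that column instead of panicking.
   fn stats_by_name<'a>(file_columns: &'a [ColumnStats], name: &str) -> Option<&'a ColumnStats> {
       file_columns.iter().find(|c| c.name == name)
   }
   
   fn main() {
       // Merged table schema across sample1.parquet and sample2.parquet.
       let table_schema = ["a", "b"];
   
       // sample1.parquet only contains column `a` (min/max values are made up).
       let sample1 = vec![ColumnStats { name: "a".to_string(), min: 1, max: 4 }];
   
       // Index 0 exists in both the table schema and the file, so this works.
       println!("by index 0: {:?}", stats_by_table_index(&sample1, 0));
   
       for (idx, name) in table_schema.iter().enumerate() {
           match stats_by_name(&sample1, name) {
               Some(s) => println!("{}: min={} max={}", s.name, s.min, s.max),
               None => println!("{} (table index {}) is not in this file", name, idx),
           }
       }
   
       // stats_by_table_index(&sample1, 1) would panic here with
       // "index out of bounds: the len is 1 but the index is 1".
   }
   ```
   
   With a name-based lookup, a column that is missing from a file yields `None`, and the caller can decide not to prune on it.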
   
   **To Reproduce**
   Create two Parquet files whose schemas contain different fields. I put 4 numbers into each file.
   
   ```
   file: sample1.parquet
   message schema {
       REQUIRED INT32 a;
   }
   
   file: sample2.parquet
   message schema {
       REQUIRED INT32 b;
   }
   ```
   
   code:
   ```
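   // Imports added for completeness; the paths assume the DataFusion release
   // current when this issue was filed and may differ in other versions.
   use std::sync::Arc;
   
   use datafusion::datasource::file_format::parquet::{ParquetFormat, DEFAULT_PARQUET_EXTENSION};
   use datafusion::datasource::listing::ListingOptions;
   use datafusion::error::Result;
   use datafusion::prelude::*;
   
   #[tokio::main]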
   async fn main() -> Result<()> {
       // create local execution context
       let mut ctx = ExecutionContext::new();
   
       // Configure listing options
       let file_format = ParquetFormat::default().with_enable_pruning(true);
       let listing_options = ListingOptions {
           file_extension: DEFAULT_PARQUET_EXTENSION.to_owned(),
           format: Arc::new(file_format),
           table_partition_cols: vec![],
           collect_stat: false,
           target_partitions: 1,
       };
   
       ctx.register_listing_table(
           "FANCY_TABLE",
           "file:///absolute-path/table/",
           listing_options,
           None,
       ).await.unwrap();
   
       let df = ctx
           .sql("SELECT * FROM FANCY_TABLE where a > 2 or b > 2")
           .await?;
   
       df.show().await?;
   
       Ok(())
   }
   ```
   
   **Expected behavior**
   The query executes without any issues.
   
   When pruning is disabled, everything works and I receive the following result.
   ```
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

