toutane opened a new pull request, #2298:
URL: https://github.com/apache/iceberg-rust/pull/2298

   ## Which issue does this PR close?
   
   - Closes: #2220
   - Related: #1604 (DataFusion read performance bottlenecked by 
single-threaded execution), #128 (size-based planning, proposed long-term 
solution)
   
   ## What changes are included in this PR?
   
   ### Approach
   
   The issue proposed modifying `IcebergTableScan` directly to accept 
`Vec<Vec<FileScanTask>>` and return `UnknownPartitioning(n)`. This PR takes a 
different approach: rather than changing the existing scan path, it introduces 
two new types. This preserves full backward compatibility with 
`IcebergTableProvider` / `IcebergTableScan` and lets users explicitly choose 
parallel file scanning when they need it.
   
   **Adds two new public types to `iceberg-datafusion`:**
   
   - `IcebergPartitionedScan`: a DataFusion `ExecutionPlan` where each 
`FileScanTask` maps to exactly one partition, enabling DataFusion to dispatch 
file reads in parallel
   
   - `IcebergPartitionedTableProvider`: a catalog-backed `TableProvider` that 
builds an `IcebergPartitionedScan` on every query, always fetching the latest 
snapshot
   
   ### Design choices
   
   - **One file = one partition**
   `IcebergTableScan` uses `UnknownPartitioning(1)` and streams all files 
sequentially through a single partition. `IcebergPartitionedScan` uses 
`UnknownPartitioning(n_files)`, giving DataFusion the information it needs to 
schedule `execute(i)` calls concurrently, one per file.
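As a toy sketch of this mapping (simplified stand-in types invented for illustration, not the actual `FileScanTask` struct or DataFusion's `ExecutionPlan` trait), one partition per file might look like:

```rust
// Hypothetical, simplified stand-ins for the real types.
struct FileScanTask {
    data_file_path: String,
}

struct PartitionedScan {
    tasks: Vec<FileScanTask>,
}

impl PartitionedScan {
    // Mirrors Partitioning::UnknownPartitioning(n_files): one partition
    // per FileScanTask.
    fn partition_count(&self) -> usize {
        self.tasks.len()
    }

    // execute(i) reads exactly one file; the scheduler may call this
    // concurrently for each i in 0..partition_count().
    fn execute(&self, partition: usize) -> Option<&FileScanTask> {
        self.tasks.get(partition)
    }
}

fn main() {
    let scan = PartitionedScan {
        tasks: vec![
            FileScanTask { data_file_path: "data/f1.parquet".into() },
            FileScanTask { data_file_path: "data/f2.parquet".into() },
        ],
    };
    assert_eq!(scan.partition_count(), 2);
    assert!(scan.execute(1).is_some());

    // An empty table yields zero partitions, so execute(0) returns None
    // instead of panicking out of bounds.
    let empty = PartitionedScan { tasks: vec![] };
    assert_eq!(empty.partition_count(), 0);
    assert!(empty.execute(0).is_none());
}
```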
   
   - **Table reloaded on every scan**
`IcebergPartitionedTableProvider` loads the table in two places: once at 
construction, to snapshot the Arrow schema for DataFusion planning, and again 
on every scan, to guarantee the freshest snapshot is read. This mirrors the 
behavior of `IcebergTableProvider`.
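A minimal sketch of the double-load behavior (the `Catalog` and provider types here are toy stand-ins invented for illustration; the real code goes through the iceberg catalog API):

```rust
use std::cell::Cell;

// Toy catalog that counts loads and can advance its current snapshot.
struct Catalog {
    loads: Cell<usize>,
    schema: &'static str,
    snapshot_id: Cell<u64>,
}

impl Catalog {
    fn load_table(&self) -> (u64, &'static str) {
        self.loads.set(self.loads.get() + 1);
        (self.snapshot_id.get(), self.schema)
    }
}

struct PartitionedTableProvider<'a> {
    catalog: &'a Catalog,
    planning_schema: &'static str, // snapshotted once at construction
}

impl<'a> PartitionedTableProvider<'a> {
    fn try_new(catalog: &'a Catalog) -> Self {
        // Load #1: capture the schema for query planning.
        let (_, schema) = catalog.load_table();
        Self { catalog, planning_schema: schema }
    }

    fn scan(&self) -> u64 {
        // Load on every scan: always read the latest snapshot.
        let (snapshot, _) = self.catalog.load_table();
        snapshot
    }
}

fn main() {
    let catalog = Catalog {
        loads: Cell::new(0),
        schema: "id: long, name: string",
        snapshot_id: Cell::new(1),
    };
    let provider = PartitionedTableProvider::try_new(&catalog);
    assert_eq!(catalog.loads.get(), 1);
    assert_eq!(provider.planning_schema, "id: long, name: string");

    catalog.snapshot_id.set(2);      // a commit lands after construction
    assert_eq!(provider.scan(), 2);  // the scan still sees the new snapshot
    assert_eq!(catalog.loads.get(), 2);
}
```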
   
   - **No stored projection/predicate fields**
   The struct is intentionally self-contained: its full state is `(tasks, 
file_io, schema)`.
   
   ### Known limitations
   
   - No limit pushdown: `_limit` is not forwarded to `IcebergPartitionedScan`. 
DataFusion inserts a `GlobalLimitExec` above any leaf that does not implement 
pushdown, so correctness is maintained
   
   - No writes: `insert_into` returns `FeatureUnsupported`. Use 
`IcebergTableProvider` for write operations
   
   - Schema staleness on projection: projection indices are resolved against 
the schema captured at construction time. This is inherited behavior from 
`IcebergTableProvider`
   
   ## Are these changes tested?
   
   Two unit tests are added in `table/partitioned.rs`:
   
   - `test_empty_table_zero_partitions`: verifies that an empty table produces 
a zero-partition scan, guarding against an out-of-bounds panic on `execute(0)`
   
   - `test_one_partition_per_file`: verifies that N data files produce exactly 
N DataFusion partitions in `IcebergPartitionedScan`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

